Cricket trouble/optimizing
Sigurd Mytting
2010-12-08 17:06:14 UTC
Hi!

Running NAV 3.7.0 on Debian 5-64-bit, about 700 switches and routers
currently in NAV.

Cricket has been behaving oddly since I upgraded from the latest release
in the 3.5 series to 3.7. Some things got better from 3.6 to 3.7, but not
everything is perfect. I'm not complaining, just summarizing what I've
seen over the last few months of releases.

About a week ago my Cricket graphs just stopped. After some debugging I
found that the Cricket cron job, run as navcron, dies after a while and
leaves a lock file behind. I cleaned it up and did some optimizing
(separating switch and router interfaces and system values into separate
cron jobs), but the cron jobs still die and leave lock files. First of
all: is anyone else seeing the Cricket collector die? (In my case it
happens on just about every run.)
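
As a stopgap for the leftover lock files, a wrapper around the cron job could check the lock's age before each run. This is only a minimal sketch: it assumes a lock older than some chosen threshold can be treated as stale, and the function name and threshold are made up for illustration. Cricket's actual lock file location and format are not covered here.

```python
import os
import time


def is_stale_lock(path, max_age_seconds, now=None):
    """Return True if the lock file exists and is older than max_age_seconds.

    `path` is whatever lock file the collector job leaves behind
    (hypothetical here); `now` can be passed in to make testing easier.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except FileNotFoundError:
        return False  # no lock at all, so nothing to clean up
    return (now - mtime) > max_age_seconds
```

A wrapper would remove the lock only when this returns True, so a collector that is still running legitimately is left alone.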

Second: does anyone have a good tip on how to tune Cricket to run a
sensible number of jobs, so it gets through everything in a sensible
amount of time? Currently, if the largest job (switch interfaces) doesn't
die after only a few minutes, it usually runs for 25-30 minutes.

NAV in my environment isn't really a production system, but my
pet-project provides more accurate and sensible information than any
system we have in production so I'll hang on to it as long as I can.

Any input appreciated!

Cheers,

-Sigurd
Sigurd Mytting
2010-12-09 23:17:34 UTC
Post by Sigurd Mytting
Hi!
Second: does anyone have a good tip on how to tune Cricket to run a
sensible number of jobs, so it gets through everything in a sensible
amount of time? Currently, if the largest job (switch interfaces) doesn't
die after only a few minutes, it usually runs for 25-30 minutes.
I got a tip to run about 50 devices per Cricket collector. This seems both
to stop the collector from dying and to let Cricket finish before its
next run.
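
For anyone wanting to try the same thing, the grouping itself is simple. Here is a rough sketch, assuming the device names are available as a plain list; the function name and the generated device names are hypothetical, and how each batch is turned into a separate Cricket config subtree or cron job is left out.

```python
def split_into_batches(devices, batch_size=50):
    """Split a list of devices into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [devices[i:i + batch_size]
            for i in range(0, len(devices), batch_size)]


# Hypothetical inventory of 700 devices, matching the size mentioned above.
devices = [f"device{n:03d}" for n in range(700)]
batches = split_into_batches(devices)
print(len(batches))  # 14 batches of 50 devices each
```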

Cheers,

-Sigurd
Stokkenes Vidar
2010-12-10 08:08:28 UTC
And FYI: since I've requested the exact same thing before (and thought there was already a blueprint for it), I took the liberty of submitting a blueprint on Launchpad to add this feature.

-Vidar

Morten Brekkevold
2010-12-17 09:30:51 UTC
Post by Sigurd Mytting
Post by Sigurd Mytting
Second: does anyone have a good tip on how to tune Cricket to run a
sensible number of jobs, so it gets through everything in a sensible
amount of time? Currently, if the largest job (switch interfaces) doesn't
die after only a few minutes, it usually runs for 25-30 minutes.
I got a tip to run about 50 devices per Cricket collector. This seems
both to stop the collector from dying and to let Cricket finish before
its next run.
Glad to see your problem was solved. I'll still add a few comments.

I've no experience with Cricket consistently crashing and leaving its lock
files behind. I do have experience with Cricket going bananas and eating all
available RAM and running forever until its config tree is recompiled.

If you set Cricket to log debug info, can you glean any idea of why it
crashes from the logs?

Also, when NTNU originally added the Cricket integration to NAV, they were
completely unable to collect traffic statistics from all their access ports:
there were simply too many of them to complete a collection round in anything
remotely close to five minutes.

They decided not to collect stats for access ports, and that is how the EDGE
category was born. An EDGE device is the same as an SW device, except that
no Cricket configuration is generated for switch-ports on EDGE devices.

I do remember talk of someone who wrote a program to automatically split the
Cricket config tree into sizable chunks so that multiple collectors could run
in parallel and complete collection in a timely manner. There's also the
issue of optimizing RRD writes, which can be horribly inefficient when
attempting to scale up.
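
I don't have that program, but the "multiple collectors in parallel" part could look roughly like the sketch below. It assumes each chunk of the config tree can be collected by its own command line. The function name and `max_parallel` are made up for illustration, and the commands are passed in as plain argument lists because the right collector invocation depends on your installation.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_collectors(commands, max_parallel=4):
    """Run each command (a list of argv strings), at most max_parallel at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run_one(argv):
        return subprocess.run(argv).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Stand-in commands; a real setup would put one collector
    # invocation per config chunk here.
    demo = [[sys.executable, "-c", "pass"] for _ in range(3)]
    print(run_collectors(demo, max_parallel=2))
```

Capping the number of simultaneous collectors keeps memory use and RRD disk writes bounded, while still using more than one process.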
--
Morten Brekkevold
UNINETT