[nylug-workshop] Log analyzer (Re: Regular meetings of the Python workshop)

Yusuke Shinyama yusuke at cs.nyu.edu
Tue Apr 10 03:37:50 EDT 2007


Hi,

I put up a newer version of logweeder.
http://www.unixuser.org/~euske/python/logweeder/

This version is significantly faster.
Now it can process thousands of lines per second, and
it can also load a previously constructed pattern file for further speedup.

The major drawback is that the algorithm is now greedy, so it
might not return optimal results.  For example, if the log file
looks like the following, line 1 and line 100000 are recognized as
different clusters:

  line 1: ypbind[1234]: aaaa
  line 2: ypbind[1234]: bbbb

    (cluster for "ypbind[1234]: ..." is formed)

    (then ypbind restarted for some reason)

  line 100000: ypbind[7890]: aaaa ...
  line 100001: ypbind[7890]: bbbb ...

    (another cluster for "ypbind[7890]: ..." is formed)
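
To illustrate why a greedy single pass splits these lines, here is a
minimal sketch in Python 3.  It is not logweeder's actual code: the
names greedy_cluster and similarity, the whitespace tokenization, and
the 0.5 threshold are all assumptions made up for this example.  Each
line joins the first existing cluster that is similar enough, and
differing tokens are replaced by a wildcard.  Once "ypbind[1234]:" has
been fixed as the literal part of the first cluster, lines from the
restarted process no longer match it and start a new cluster.

```python
WILDCARD = "*"  # hypothetical marker for a variable token


def similarity(pattern, tokens):
    """Fraction of positions where the pattern holds the same literal token."""
    if len(pattern) != len(tokens):
        return 0.0
    same = sum(1 for p, t in zip(pattern, tokens) if p == t and p != WILDCARD)
    return same / len(tokens)


def greedy_cluster(lines, threshold=0.5):
    """Assign each line to the first similar cluster, never revisiting."""
    patterns = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        for i, pattern in enumerate(patterns):
            if similarity(pattern, tokens) >= threshold:
                # Generalize: keep matching tokens, wildcard the rest.
                patterns[i] = [p if p == t else WILDCARD
                               for p, t in zip(pattern, tokens)]
                break
        else:
            # No existing cluster was close enough, so start a new one.
            patterns.append(tokens)
    return patterns


log = [
    "ypbind[1234]: aaaa",
    "ypbind[1234]: bbbb",
    "ypbind[7890]: aaaa",
    "ypbind[7890]: bbbb",
]
for pattern in greedy_cluster(log):
    print(" ".join(pattern))
```

With this sample input the sketch ends up with two patterns,
"ypbind[1234]: *" and "ypbind[7890]: *", rather than a single pattern
in which the process ID is the variable part.  A non-greedy method
could compare all lines before fixing any pattern, at a higher cost.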

I haven't yet incorporated Peter's patch because the program
structure is rather different, although I mostly incorporated the
general idea.  (And personally I'm not fond of using ConfigParser.)

Yusuke

