> It depends.... Sometimes you'd like to keep all the logging (e.g., think
> of firewalls), so it's easier to filter out the interesting ones. What I've
> done on my machine (which acts as a firewall and log host for several
> other machines... please do not start the discussion that it's a bad
> idea to mix the log host and firewall on one machine :-) is to generate
> new log files nightly (crontab entry) and save the old ones.
>
> I made a script using awk to find events in the syslog file that might
> be interesting. The logic of the script is something like:
>
>     For each of the machines:
>         find the entries in the syslog file generated by that machine
>         extract interesting events
>         collect filter rejection messages (as I said, it's also a firewall)
>         ignore standard events
>         extract what's left
>     If anything interesting was found, e-mail me; otherwise send
>     confirmation that the script was run.

It doesn't matter how many log files you process. The general principle is
that you should drop all the non-interesting lines, after which only the
interesting ones remain. It's like defensive programming: when you write a
program that accepts only certain strings, you should allow just the valid
ones. If you forget one of the valid strings, that never breaks security.
If, on the other hand, you disallow the invalid ones and forget one, you do
break security. In your case, if you overlook a few log lines indicating an
intruder, you will never find him.

As you already said, you have large log files, which means you will never
inspect them line by line. I admit that building the set of regexps that
filter out the non-interesting lines is hard work, but it's worth it. Here
at Origin, we have about 160 MB of log files per day from only a small
number of firewalls, and we handle them the way I described, with lots of
success. Of course you need the right tools to filter the log files
efficiently, but that's another story.

-Guido
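P.S. The per-machine loop and the subtractive filtering could be sketched
roughly like this (using grep rather than awk for brevity; all file names,
host names, and patterns here are made up for illustration, not the actual
script):

```shell
#!/bin/sh
# Principle: per machine, keep only that machine's lines, then DROP every
# line matching a known-uninteresting pattern. Whatever survives is, by
# definition, interesting and worth mailing to the admin.

LOG=/tmp/syslog.sample
IGNORE=/tmp/ignore.regex

# Sample syslog data (stands in for the real nightly log file).
cat > "$LOG" <<'EOF'
Jan  1 00:00:01 gatekeeper cron[123]: (root) CMD (run-parts /etc/cron.hourly)
Jan  1 00:00:02 gatekeeper kernel: REJECT tcp 10.0.0.5:4242 -> 192.168.1.1:23
Jan  1 00:00:03 mailhub sendmail[456]: stat=Sent
EOF

# Patterns positively identified as harmless; this list grows over time.
cat > "$IGNORE" <<'EOF'
cron\[[0-9]+\]: \(root\) CMD
sendmail\[[0-9]+\]: stat=Sent
EOF

for host in gatekeeper mailhub; do
    # 1. keep only this machine's entries; 2. drop the known-boring ones.
    interesting=$(grep " $host " "$LOG" | grep -Ev -f "$IGNORE")
    if [ -n "$interesting" ]; then
        # The real script would e-mail this; here we just print it.
        printf '%s:\n%s\n' "$host" "$interesting"
    fi
done
```

If you forget to list a harmless pattern, the worst case is an extra line in
the report; forgetting a pattern in a "match the bad stuff" approach would
silently hide an attack.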