This is something very much like what we tried around 1984 at Sytek; those reports may still be available. We concluded that the approach was too crude to get a good detection rate and a low false alarm rate. The primary problem is that there is not enough data in lastcomm, UNIX accounting, or any of those easily available places. But another real problem is that the approach is too hard and fast: you can't really say that if a user has never done X before, then seeing him do X must mean an intruder. Generally, users have patterns of behavior in which they do X very infrequently, but with some probability. You should let them do X with that probability without raising an alarm; only if you start seeing them do it much more frequently should you raise one. This gives you a much better picture of what's normal for a person (you'll have the relative probabilities of various actions or categories of actions), and it greatly reduces the false alarm rate. But you have to do a lot more bookkeeping, and the analysis is much more complicated than "does he or doesn't he." Then you get into nuances: a person's behavior changes over time, there are different patterns on weekends or off hours than during the day or week, and so on. You will get better and better detection and false alarm rates the further you go along this path. We did it with many iterations of IDES prototypes, and then NIDES.

You ask what areas need more research and development. I think most of the intrusion detection systems out there have not demonstrated, either in comprehensive experiments or in real operation, that they have satisfactory detection rates and false alarm rates. It is not too much of a stretch to say that in all cases, all we can go on are the claims of the makers. It would be a VERY interesting research study if someone were to take these systems and put them through their paces, so to speak, with a series of very comprehensive and realistic experiments.
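[Ed.: the probability-based alarm logic described above can be sketched in a few lines of Python. This is an illustrative toy, not the actual IDES/NIDES statistics; the category names, the ratio threshold, and the floor probability for unseen categories are all assumptions made up for the example.]

```python
from collections import Counter

def build_profile(history):
    """Relative frequency of each action category in a user's history."""
    counts = Counter(history)
    total = len(history)
    return {cat: n / total for cat, n in counts.items()}

def anomalies(profile, recent, ratio_threshold=5.0, floor=0.001):
    """Flag categories whose recent frequency greatly exceeds the user's
    historical probability, instead of alarming on any first use.
    Unseen categories get a small floor probability (unseen != impossible)."""
    counts = Counter(recent)
    total = len(recent)
    flagged = []
    for cat, n in counts.items():
        expected = profile.get(cat, floor)
        observed = n / total
        if observed / expected >= ratio_threshold:
            flagged.append(cat)
    return flagged

# A user who runs internet commands occasionally is not an anomaly;
# a sudden burst of them is.
profile = build_profile(["Ed"] * 50 + ["Mail"] * 30 + ["Com"] * 15 + ["Int"] * 5)
print(anomalies(profile, ["Int"] * 8 + ["Ed"] * 2))   # ['Int']
print(anomalies(profile, ["Int"] * 1 + ["Ed"] * 9))   # []
```

Note how the second call raises no alarm: one internet command in ten is within the user's historical probability, which is exactly the "let them do X with that probability" behavior described above.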
By realistic I mean approximating as closely as possible the conditions of actual use (where most activity is normal, and there are a few bad things happening here and there, buried in the audit trails). By comprehensive I mean that as many different attack and misuse types are tested as possible. Some systems, like NIDES, allow you to augment their rulebases, so that if an attack type is not detected you can add it to the rulebase and detect it next time. With most others, like the Digital one and the one from Haystack Laboratories (called Stalker), you can't augment the rulebase, so comprehensive tests should give an accurate picture of the limitations of such systems. Unlike the others, NIDES has a very strong statistical analysis component, and tests of NIDES (and of the other systems) should include comprehensive and realistic experiments to see whether departures from normal behavior can be detected with satisfactory detection and false alarm rates.

Teresa

=================

Date: Wed, 31 Aug 1994 13:49:19 -0400
From: David R Landry <dlandry@afit.af.mil>
Subject: Account Profiling with UNIX commands

> Anyone have any thoughts on how to build an account profile so that a sudden
> change in behaviour will be obvious?

I am working on profiling users based on their use of certain commands. The input to my program is logs based on the lastcomm command, such as:

    rusers X dlandry ttyp0  0.16 secs Thu Aug 25 18:47
    mps      dlandry ttyp0  0.14 secs Thu Aug 25 18:47
    rn       dlandry ttyp0  1.16 secs Thu Aug 25 18:42

My output after processing the logs looks like this:

    ----------------------------------------------------------------------
    username  Com  Unix1  Unix2  Mail  News  Info  Ed  Pro  Int  HAR  BAD
    ----------------------------------------------------------------------
    user1      X     X      X     X     X
    user2      X     X
    user3      X     X      X     X     X     X    X    X
    user4      X     X      X     X
    user5      X     X      X     X
    ----------------------------------------------------------------------

Com indicates commonly used UNIX commands / programs such as sh and csh.
Unix1 indicates use of commands such as mv, cp.
Unix2 indicates use of commands such as awk, find.
Mail indicates use of any mailing tools.
News indicates use of any news group readers.
Ed indicates use of the editors vi or emacs.
Pro indicates use of programming languages such as c++, cc.
Int indicates use of internet commands such as telnet, ftp, etc.
HAR and BAD look for specific attack-signature-related commands.

After comparing two weeks of data, 83% of the users had the exact same profile or a subset of their profile. The others added a category or two, indicating either that their profile was not complete or "a possible intrusion."

I realize this is a very primitive method of security, and that many systems out there (IDES, NIDES, DIDS, Haystack) are light years beyond this in statistically analyzing users. I also realize that most system administrators do not have these tools. My question to the intrusion group is this: Am I going in a valid direction? Are you all thinking "yeah, that's kind of neat," or "we did that about 10 years ago"? Do investigations into user profiling still need to be done? Have all the problems been solved in this area? I am very familiar with the IDSs out there; I would just like a response as to what areas need more research and development.

--------------------------------------
2LT David R. Landry
Graduate Student, AI/Computer Security
Air Force Institute of Technology
dlandry@afit.af.mil
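[Ed.: the lastcomm categorization David describes could be sketched roughly as below. The command-to-category mapping is illustrative, not his actual table, and the field-layout handling (an optional one-letter accounting flag between the command and the user name) is an assumption about the lastcomm output shown above.]

```python
# Hypothetical category mapping for illustration only.
CATEGORIES = {
    "Com":   {"sh", "csh", "ls"},
    "Unix1": {"mv", "cp"},
    "Unix2": {"awk", "find"},
    "Mail":  {"mail", "elm"},
    "News":  {"rn", "trn"},
    "Ed":    {"vi", "emacs"},
    "Pro":   {"cc", "c++"},
    "Int":   {"telnet", "ftp", "rusers"},
}

def profile_users(lastcomm_lines):
    """Map each username to the set of categories seen in the log.
    Assumes field 0 is the command; a one-letter accounting flag
    may appear between the command and the user name."""
    profiles = {}
    for line in lastcomm_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        command = fields[0]
        user = fields[2] if len(fields[1]) == 1 and len(fields) > 2 else fields[1]
        for cat, commands in CATEGORIES.items():
            if command in commands:
                profiles.setdefault(user, set()).add(cat)
    return profiles

log = [
    "rusers X dlandry ttyp0  0.16 secs Thu Aug 25 18:47",
    "mps      dlandry ttyp0  0.14 secs Thu Aug 25 18:47",
    "rn       dlandry ttyp0  1.16 secs Thu Aug 25 18:42",
]
print(profile_users(log))  # {'dlandry': {'Int', 'News'}} (set order may vary)
```

Comparing two weeks is then a set operation: any category in `week2[user] - week1[user]` is either an incomplete profile or "a possible intrusion," which is exactly the binary does-he-or-doesn't-he test Teresa's reply argues is too hard and fast.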