that do something similar to what you are interested in, I think.
had a chance to describe them yet.
share my experiences and help to make them better.
The undocumented program rauserdata() will analyze the user
data portion of argus records, and generate a signature pattern
configuration for the protocols it encounters in the data set.
The algorithm is very simple, and pretty powerful, in that it makes
no assumptions about the user data. However, you have to be careful
with the data that you give the engine. Below is a little background.
I have found that for the set of protocols you listed, the
first 32 bytes of data are all that is needed to reliably identify the
protocol type. This is because each of your protocols has a
unique greeting identifier, and for the ones you listed, the
identifiers are all in ASCII.
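
To make the "first 32 bytes" idea concrete, here is a minimal sketch in Python. The token table and the classify() helper are hypothetical and only illustrate prefix matching on ASCII greetings; they are not the patterns or the algorithm that rauserdata() actually uses.

```python
# Hypothetical signature table: ASCII tokens that commonly open a session.
# Illustrative only; these are NOT the patterns rauserdata() generates.
SIGNATURES = [
    (b"SSH-", "ssh"),
    (b"HTTP/1.", "http"),
    (b"GET ", "http"),
    (b"POST ", "http"),
    (b"EHLO ", "smtp"),
    (b"HELO ", "smtp"),
    (b"+OK", "pop3"),
    (b"* OK", "imap"),
]


def classify(user_data: bytes, window: int = 32) -> str:
    """Label a user data buffer by the token its first `window` bytes start with."""
    head = user_data[:window]
    for token, label in SIGNATURES:
        if head.startswith(token):
            return label
    return "unknown"
```

Anything that does not open with a known token, such as binary or encrypted payloads, simply falls through to "unknown" in this sketch.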
Because argus provides multiple status reports for long-lived
flows, not all argus records for a given flow will contain the
"first N byte" signatures that you are seeking.
Using racluster() on your 'primitive' argus data will usually provide
you with the "first N bytes" of user data, so that your search for
tokens and patterns can be reliable.
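
The reason merging helps can be sketched as follows. This is an assumption-laden illustration, not racluster()'s implementation: the Record type, its fields, and first_user_data() are hypothetical stand-ins for argus records and their aggregation.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Record(NamedTuple):
    """Hypothetical, simplified stand-in for one argus status report."""
    flow: str      # flow identifier, e.g. "tcp 10.0.0.1:1234 -> 10.0.0.2:25"
    start: float   # start time of this status report
    suser: bytes   # source user data captured in this report
    duser: bytes   # destination user data captured in this report


def first_user_data(records: Iterable[Record]) -> Dict[str, Tuple[bytes, bytes]]:
    """Keep the earliest non-empty source and destination buffers per flow."""
    merged: Dict[str, Tuple[bytes, bytes]] = {}
    for rec in sorted(records, key=lambda r: r.start):
        suser, duser = merged.get(rec.flow, (b"", b""))
        # Later status reports never overwrite a buffer that is already set.
        merged[rec.flow] = (suser or rec.suser, duser or rec.duser)
    return merged
```

With the status reports for a flow collapsed this way, a signature search sees the opening bytes of the conversation rather than whatever happened to be in a later report.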
Try this out for a while to see if you get anything useful:
racluster -r /a/days/worth/of/data/of/interest/* -w /tmp/day.cache
rauserdata -r /tmp/day.cache | less
You should get output that is arranged by 'protocol/port', and
you should see a set of source and destination user data buffers
that have the "greatest likelihood" patterns for that "proto/port" pair.
ra() prints the user data buffer with an ASCII encoding by default, so
you should see some patterns in the buffers it outputs.
If you see a '.', that is generally a non-printable character.
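
As an illustration of that kind of encoding, here is a minimal Python sketch. The render_ascii() helper is hypothetical and only mimics the general convention of showing non-printable bytes as '.'; it is not ra()'s actual code.

```python
def render_ascii(buf: bytes) -> str:
    """Show printable ASCII bytes as-is and every other byte as '.'."""
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in buf)
```

For example, render_ascii(b"220 mail\r\n") gives "220 mail..", because the trailing carriage return and line feed are not printable. Note that a literal '.' in the data looks the same as a replaced byte, which is why a '.' is only "generally" a non-printable character.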
The idea is to build up configuration files of signatures using rauserdata();
the program raservices() will then take the rauserdata() output as
a configuration file, and label flows with the tags that identify the
Give it a try, and I'd love to see/hear your comments.
On Feb 3, 2009, at 12:59 AM, Oguz Yarimtepe wrote:
Depends on what you need. If you enable user data capture (the -U
option to argus), it will capture up to the first 512 bytes of the user
data of the flow. That may or may not give you enough information about the
flow to do what you want. Note that on a fast link, best results are going to
occur using a DAG card, as the data capture adds a fairly heavy load to the
server. To display the data with ra (for instance) you need to use the -s
option to add suser and duser to the output, as in
ra -r argus_file -n -s +suser:512 -s +duser:512
which will tack the user data on the end of the line. This of course raises a
number of sticky privacy issues that you need to have considered and gotten
approved by appropriate management of the link you are tapping (which may or
may not be you).
Peter Van Epp
What I want to do is characterize network traffic using characteristics derived from flow information. My final decision about a flow record will be whether the flow belongs to a chat session, a mail transfer, an FTP connection, web browsing, ...
I had discovered Bro, which has identifiers for high-level protocols. It supports fewer protocols than Argus does, so I was planning to go on with Argus.