Nick Diel | 1 Mar 02:45

Re: racluster() memory control Re: New To Argus

Carter,

Thanks for the information.  I have been playing around with the timeout period with great success, though what is the status entry for?  If this is documented somewhere, I apologize, but I couldn't find it.

I think the radark() method is quite clever, but in my situation I am not able to do that (yet).  I am capturing data at a transit provider and immediately anonymizing the data.  I don't have access to know which subnets are darks, but I will investigate if I can find one.  I do think this could be a powerful research tool for me.

For now after I merge status flows, I think I will create a filter to purge the port scans from some of my outputs.

Nick

Carter Bullard wrote:
The default idle timeout is infinity.

I think if you pre-process the stream with something like radark()
which provides you with the IP addresses of scanners, you can
reject traffic involving those addresses or you just filter traffic
that is going to IP addresses that don't exist, you will do well.

We have limits in the code, I just need to reduce the number so it
doesn't kill the average machine.  We have means of passing the
limit to the clients as well, in the .rarc file, so that should be easy
to do.

Carter



On Feb 28, 2008, at 4:59 PM, Nick Diel wrote:

Carter,

I am going to start playing around with idle=timeout.  If that parameter is not specified, is there a default value, or will all flows stay in the cache?  This parameter looks very promising for my use.

Where we do most of our capturing we can see millions of port scans in a 12 hour trace, so that is an issue for us too when we do flow filtering.  I wonder if a separate timeout would be useful for flows that just have a syn?  Basically trying to purge out port scans faster.  Or maybe in a memory constraint model these flows are picked first to be outputted.

I also think flushing out "complete" tcp flows is a good idea too.  Maybe a second timeout should be in place for these flows (it would be shorter than the regular timeout and potentially 0), that way if you wanted you could capture anomalies such as duplicate fin acks.  This second timer could also be used for flows that have a reset, since it is very common to see additional packets after a reset (packets still on the wire, reset gets lost, etc.).

Finally, I can see how a memory limit could be beneficial.  While yes, it does create a problem where results are going to be influenced by the amount of memory available, it will allow for processing that may not otherwise be possible (or at least easily doable).  When I was producing a list of flows on port 25, I had to use very aggressive filters to handle memory issues, and I know I missed some flows anyway.  We ended up with 20 million plus flows for our 12 hour capture.  I would have been willing to set a memory limit, knowing that possibly not all flows would be combined properly.  In my case at least, I would expect most flows output early due to memory constraints to be port scans and complete flows that haven't reached their idle timeout yet.  So again this would be a site specific option.  Per your list:

   1. filter input data
   2. change the flow model definition to reduce the number of flows,
   3. use the "idle=timeout" option on items in the racluster.conf file.
   4. use memory limit for very large data sets with knowledge it could affect the actual output.

Basically a memory limit is used when the others are not working.  It just allows for processing that may not have been easily possible.

Nick


Carter Bullard wrote:
Hey Nick,
I can put memory limits into racluster(), but then there is the possibility
that you get different results based on the available space.  I'm not sure
that is the right way to go, but who knows, it may be a great option.

The trick to keeping racluster memory use down is to:
   1. filter input data
   2. change the flow model definition to reduce the number of flows,
   3. use the "idle=timeout" option on items in the racluster.conf file.  

This all needs to be customized for each site, so working with the
racluster.conf file is the way to go, and running different .conf files
against one sample test data file allows you to fine tune the configuration.

Getting darknet traffic out of the mix is important.  For many sites
"all the flows" are really scans, and should be ignored, 99.999%
of the time.  I track if something new responds to a scan, not that the
scan exists, because there is always a scan, because many of the sites
that I pay attention to have literally 100,000's of scans a day.  As a
result, we want to pay attention to the originator of the scan, the scan
type, if the addresses involved are real, if it's coming from inside or outside
and if there was a "new" response.  Sloughing scan traffic to tools
that do scan analysis, and tracking the other flows makes this
a doable thing, and programs like radark()/rafilteraddr() help here
(but they are just examples).

For traffic you really want to track, modifying the flow model allows
us to reduce the number of flow caches, say, by ignoring the source
port for flows going to well known servers.  Lines like this:

   filter="dst host 1.2.3.4 and src pkts eq 1 and dst pkts eq 1 and dst port 53" model="saddr/16 daddr proto dport"

will reduce the number of in memory caches for this DNS server to
just the number of class B networks hitting the server.  The filter
only applies to successful connections, and the resulting aggregation
record will contain stats like the number of connections and avgdur.
(I'm assuming here that you will run racluster() against a file).
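
To illustrate what a model like "saddr/16 daddr proto dport" does to the number of cached entries, here is a rough sketch outside of argus.  It assumes a hypothetical text input with one flow per line as "saddr daddr proto sport dport"; the function name and the input format are made up for the example and are not racluster() output or code.

```shell
# Hypothetical input: one flow per line as "saddr daddr proto sport dport".
# Mimics model="saddr/16 daddr proto dport": the source port is dropped and
# the source address is truncated to its first two octets before counting.
aggregate_by_src16() {
    awk '{
        split($1, o, "[.]")
        key = o[1] "." o[2] ".0.0/16 " $2 " " $3 " " $5
        count[key]++
    }
    END { for (k in count) print count[k], k }' "$@" | LC_ALL=C sort -k2
}
```

Three flows from 10.1.2.3, 10.1.9.9 and 10.2.0.1 to the same DNS server collapse into two entries, one per /16, each carrying a count of the flows merged into it.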

You can have literally 1000's of these filters to generate the data
you really want.

When a flow hits its idle time, racluster will write out the record (if
needed) and deallocate the memory used for that flow.  So having
aggressive idle times is very helpful.
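
The idle timeout behaviour described above can be sketched roughly like this.  This is only an illustration and not how racluster() is implemented: it assumes a hypothetical, time-ordered text input of "epoch_seconds flow_key" lines, and the function name is made up.

```shell
# Hypothetical input: time-ordered lines of "epoch_seconds flow_key".
# A flow is written out and forgotten once it has been idle for more than
# $1 seconds; whatever is still cached at end of input is written last.
# Output lines are "flow_key first_seen last_seen record_count".
merge_with_idle_timeout() {
    awk -v idle="$1" '
    function emit(k) { print k, first[k], last[k], n[k] }
    {
        now = $1 + 0
        # Collect expired keys first, then free them, so the cache is
        # never modified while it is being walked.
        for (k in last) if (now - last[k] > idle) expired[k] = 1
        for (k in expired) { emit(k); delete first[k]; delete last[k]; delete n[k] }
        split("", expired)
        if (!($2 in n)) first[$2] = now
        last[$2] = now
        n[$2]++
    }
    END { for (k in last) emit(k) }'
}
```

With a small idle value the cache only ever holds flows that were recently active, which is the memory saving; the cost is that a flow that goes quiet for longer than the timeout and then resumes comes out as two records.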

But now that I'm thinking about it, we don't really have a way of
flushing records based on the flow state for flows that are being merged
together.  What I mean by that, is that if a TCP connection has 12 status
records, and we finally get the last one that "closes" the connection,
there isn't a way, currently, for us to check to see if that flow should
be "flushed".

Possibly we should take the resulting merged record, and run it back
through the filters to see if there is something we need to do with it.

Well, anyway, keep sending email if any of this is useful.

Carter



On Feb 28, 2008, at 1:25 PM, Nick Diel wrote:

Carter,

Thanks for all of your input.  Also thanks for the updated Argus.

After reading what you said, I can understand why Argus was designed the way it was.  I was just initially evaluating Argus with some very simple and discrete examples.  Looking at some of the source code also helped me wrap my head around Argus.

On to the memory issue.  The system I am using has 2GB of memory in it and racluster wants to use all of it.  When racluster's usage climbs past about 1.7GB, heavy swapping occurs and racluster's CPU usage drops below 25%.  This is why I was thinking out loud about potentially giving racluster a memory limit from the command line.  This way the system could avoid the heavy swapping and just have racluster write out the oldest records before moving on.
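
A rough sketch of the "write out the oldest records" idea, to make the trade-off concrete.  It is not an existing racluster option: it assumes a hypothetical, time-ordered text input of "epoch_seconds flow_key" lines, and it counts cached flows rather than bytes, purely to keep the example short.

```shell
# Hypothetical input: time-ordered lines of "epoch_seconds flow_key".
# At most $1 flows are cached; when a new flow arrives and the cache is
# full, the least recently updated flow is written out as
# "flow_key record_count" and dropped from the cache.
merge_with_cache_limit() {
    awk -v limit="$1" '
    {
        if (!($2 in n)) {
            if (cached >= limit) {
                oldest = ""
                for (k in last)
                    if (oldest == "" || last[k] < last[oldest]) oldest = k
                print oldest, n[oldest]
                delete last[oldest]; delete n[oldest]; cached--
            }
            cached++
        }
        last[$2] = $1 + 0
        n[$2]++
    }
    END { for (k in n) print k, n[k] }'
}
```

Feeding it flow A at times 0 and 3 with flows B and C in between gives one merged A record with a limit of 3, but two separate A records with a limit of 2, so the output does depend on the limit chosen.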

Again thanks for putting up with me as I start to understand Argus.

Nick

Carter Bullard wrote:
Hey Nick,
The problem with packet capture, is primarily the disk performance.
Argus can go as fast as you can collect packets, assuming that
you're using Endace cards, and although argus does great in the
presence of packet loss, it generates its best results when it gets
everything.

The best architecture is to run argus on the packet capture box,
and to blow argus records to another machine that does the disk
operations to store records.  This division of labor works best for
the 10Gbps capture facilities.

We sort input file names in the ra* programs, so doing this for
argus is a cut, copy, paste job.  No problem, I'll put it in this week.

Argus can read from stdin.

There are many incantations that work to decrease the memory
demands of an argus client.  Just really need to know what it is
that you want to do.

OK, to your question.  

Now let me ask about what I have been working on (merging flows across argus data files).  First, if I was capturing with Argus (not reading pcap files, capturing off the wire: argus | rasplit) wouldn't I run into the same problem of having flows broken up across different argus files?

If racluster is merging records as it finds them (not reading all records into memory first), it seems it might be nice to specify a  memory limit for racluster at command line.  Then as racluster approaches the memory limit it could remove the oldest records from memory and print them to the output.

Multiple argus status records spanning files.  Well, yes that is the actual design
goal.  When you think about most types of operations/security/performance
analysis, you want to see flow data scoped over some time boundary.  Regardless
of what that boundary is, whether it's the last millisecond, second or minute or hour,
you will have flows that span that boundary.   There are a lot of flows that are
persistent, so you can't have a file big enough to hold complete flows, ....,
really.

But you don't seem to be too interested in really granular data, so you should
modify the ARGUS_FAR_STATUS_INTERVAL value to be something larger
than your file duration.  That way argus generates  only one record per flow
per file.  You use ra() to split files that are complete from those that may
continue into the next file, using the "-E" option and then after you're done
with all the files you have, then run racluster() against these continuation files.

   for i in *pcap; do argus -S 5m -r $i -w $i.argus; done
   for i in *argus; do ra -r $i -E $i.cont -w argus.out - 'tcp and ((syn or synack) and (fin or finack or reset))'; done
   racluster -r *.cont -w argus.out

They won't be sorted, but that's easy to do with an additional step:
   rasplit -M nomodify -r argus.out -M time 5m -w data/argus.%Y.%m.%d.%H.%M.%S
   rm argus.out
   rasort -R data -M replace
   ra -R data -w argus.out
   rm -rf data

Or at least something like that should work.  The "-M nomodify" is critical, as rasplit()
will break records up on time boundaries if you don't specify this option, which
puts you back in trouble if you're really trying to keep the flows together.

Argus clients aren't supposed to consume more than, what, 1GB of memory, so there
are limits in the code.  Do you have a smaller machine than that?


Carter


On Feb 25, 2008, at 2:01 PM, Nick Diel wrote:

Carter,

First of all thanks for your detailed response and updated clients.  And I am glad you like twists.

Let me tell you a little bit more about the research setup.  The research project I am part of (made up of several universities in the US) has several collection boxes in different large commercial environments.  The boxes were customized specifically for high speed packet capturing (RAID, Endace capture card, etc.).  We will run a 12 hour capture and then analyze the capture for some time.  Sometimes up to several months.  So I do have time to correctly create my argus output files and do any other processing I need to do.

Some of the researchers focus on packet based research, whereas other parts of the group focus more on flow based analysis.  So Argus looks like a great match for us.  Immediately after the capture, we can create Argus flow records and do our flow analysis with Argus clients.

So for my first question, is Argus capable of capturing at high line speeds (at least 1Gbit) where doing a packet capture using libpcap and a standard NIC may fail (libpcap dropping packets)?  Or since Argus is flow based, does it not care if it misses packets?  Some of the anomalies we research require us to account for almost every packet in the anomaly, so dropping, say, every 100th or even every 1000th packet could hamper us.  The reason I ask about Argus high speed captures is that if it is very capable at high speeds, it would allow us to deploy more collection boxes (these boxes would then primarily be used by the flow based researchers).  We wouldn't have to buy an expensive capture card for each collection box.

As for reading multiple files into Argus, one easy way to accomplish this would be to have Argus read pcap files from stdin.  Then one can use a utility such as mergecap or tcpslice to feed Argus a list of out of order files: mergecap -r /packets/*.pcap -w - | argus -r - ....

My files are named so chronological order equals lexical order, so argus -r * would work in my case (this helps us with a number of utilities we use).  I do understand that actually implementing this in Argus would probably require a number of things, such as dying when files are out of order and then telling the user what order argus was reading the files in.  Though doing this would be quite a bit faster than having tcpslice or mergecap feed Argus the pcap files.

Now let me ask about what I have been working on (merging flows across argus data files).  First, if I was capturing with Argus (not reading pcap files, capturing off the wire: argus | rasplit) wouldn't I run into the same problem of having flows broken up across different argus files?

If racluster is merging records as it finds them (not reading all records into memory first), it seems it might be nice to specify a  memory limit for racluster at command line.  Then as racluster approaches the memory limit it could remove the oldest records from memory and print them to the output.

I was able to use your suggestion successfully to merge most of my flows together, though I needed to make a few modifications to the filter.  I moved parentheses, "tcp and ((syn or synack) and ((fin or finack) or reset))" vs. "tcp and (((syn or synack) and (fin or finack)) or reset)."  And I added "not con" to filter out the many, many packet scans, though this also does not merge syn-synack flows which exist at the end of the argus output files.  This filter still caused most of the memory to be used, but not a whole lot of time was spent in the upper range where swapping was slowing the system to a crawl.  Without "not con" I would reach the upper limits of memory usage quite fast and go into a crawl with the swapping.

Thanks again for all your help,
Nick


Carter Bullard wrote:
Hey Nick,
The argus project from the very beginning has been trying
to get people away from capturing packets, and instead
capturing comprehensive flow records that account for every
packet on the wire.  This is because capturing packets at modern
speeds seems impractical, and there are a lot of problems that can
be worked out without all that data.

So to use argus in the way you want to use argus is a bit of a
twist on the model.  But I like twists ;o)

>>> To start out with something simple I want to be able to count the number of flows over TCP port 25.

The easiest way to do that right now is to do something like this in bash:

   % for i in pcap*; do argus -r $i -w - - tcp and port 25 | \
        rasplit -M time 5m -w - argus.data/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S ; \
        done

That will put the tcp:25  "micro flow" argus records into a manageable
set of files.  Now the files themselves need to be processed to
get the flows merged together:

   % racluster -M replace -R argus.data

So now you'll get the data needed to ask questions, split into 5m bins,
so to speak.  Changing the "5m" to "1h", "4h", or "1d", may generate
file structures that you can work with, but eventually you will hit a memory
wall unless you do something clever.

Now that you have these intermediate files, in order to merge the
tcp flows that span multiple files, you will need to give racluster()
a different aggregation strategy than the default.  Try a
racluster.conf file that contains these lines against the argus files
you have.

------- start racluster.conf ---------

filter="tcp and ((syn or synack) and ((fin or finack) or reset))"  status=-1 idle=0
filter="" model="saddr daddr proto sport dport"

------- end racluster.conf --------

What this will do is:
   1. any tcp connection that is complete, where we saw the beginning and the
       end, just pass it through, don't track anything.
   2. any partial tcp connection, track and merge records that match.

So it only allocates memory for flows that are 'continuation' records.
The output is unsorted, so you will need to run rasort() if you want
to do any time oriented operations on the output.

In testing this, I found a problem with parsing "-1" from the status
field in some weird conditions, so I fixed it.  Grab the newest
clients from the dev directory if you want to try this method.

ftp://qosient.com/dev/argus-3.0/argus-clients-3.0.0.rc.69.tar.gz

Give that a try, and send email to the list with any kind of result
you get.

With so many pcap files, we probably need to make some other
changes.

The easiest way for you to do what you eventually want to do,
would be for you to say something like this:
   argus -r * -w - | rawhatever

This currently won't work, and there is a reason, but maybe we
can change it.  Argus currently can read multiple input files, but you
need to specify each file using a "-r filename -r filename " like command
line list.   With 1000's of files, that is somewhat impractical.  It is this
way on purpose, because argus really does need to see packets in time order.

If you try to do something like this:

   argus -r * -w - | rasplit -M time 5m -w argus.out.%Y.%m.%d.%H.%M.%S

which is designed to generate argus record files that represent packet
behavior with hard cutoffs every 5 minutes, on the hour;    if the
packet files are not read in time order, you get really weird
results.  It's as if the realtime argus was jumping into the future and
then into the past and then back to the future again.

Now, if you name your pcap files so they can be sorted, I can
make it so "argus -r *" can work.  How do you name your pcap files?
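
Before relying on "argus -r *", it is worth checking that name order really is time order.  A minimal sketch, assuming a hypothetical index file with one "pcap_filename start_time end_time" line per capture file (epoch seconds); the function name and the index file are made up for the example and are not part of argus.

```shell
# Hypothetical index format (one line per pcap file, epoch seconds):
#   pcap_filename start_time end_time
# Sort the index by file name and complain if any file starts earlier
# than the file sorted before it.  Exit status is 0 when the lexical
# order of the names matches chronological order, 1 otherwise.
check_index_order() {
    LC_ALL=C sort -k1,1 "$1" | awk '
        NR > 1 && $2 + 0 < prev { print "out of order: " $1; bad = 1 }
        { prev = $2 + 0 }
        END { exit bad }'
}
```

If the check fails, feeding the files to argus in name order would produce the jumping-around-in-time results described above.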


Because argus has the same timestamps as the packets in your
pcap files, the timestamps can be used as an "external key" if
you will.  If you build a database that has tuples (entries) like:

   "pcap_filename start_time end_time"

then by looking at a single argus record, which has a start time
and an end time, you can  find the pcap files that contain its packets.
And with something like perl and tcpdump or wireshark, you can
feed a simple shell to look in those pcap files looking for packets
with this type of filter:

   ( ether host $smac and $dmac) and (host $saddr and $daddr) and ports \
   ($sport and $dport)

and you get all the packets that are referenced in the record.
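
The lookup step can be sketched as follows, assuming the tuples are kept in a plain text index file with one "pcap_filename start_time end_time" line per capture file and times in epoch seconds.  The function name and file layout are hypothetical; a real setup might use a database instead.

```shell
# Hypothetical index format (one line per pcap file, epoch seconds):
#   pcap_filename start_time end_time
# Print every pcap file whose time range overlaps the argus record's
# [start, end] interval.
#   $1 = index file, $2 = record start time, $3 = record end time
pcaps_for_record() {
    awk -v s="$2" -v e="$3" \
        '$2 + 0 <= e + 0 && $3 + 0 >= s + 0 { print $1 }' "$1"
}
```

The file names it prints are the only ones that need to be searched with the packet filter above, which keeps the tcpdump pass from touching all ~1000 capture files.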


Carter




On Feb 21, 2008, at 4:49 PM, Nick Diel wrote:

I am new to Argus, but have found it has great potential for the research project I work on.  We collect pcap files from several high traffic networks (20k-100k packets/second).  We collect for approximately 12 hours and have ~1000 pcap files that are roughly 500MB each.
I want to do a number of different flow analyses and think Argus might be perfect for me.  I am having a hard time grasping some of the fundamentals of Argus, but I think once I get some of the basics I will be able to really start to use Argus.

To start out with something simple I want to be able to count the number of flows over TCP port 25.  I know I need to use RACluster to merge the Argus output (I have one argus file for each pcap file I have),  that way I can combine identical flow records into one.  I can do this fine on one argus output file, but I know many flows span the numerous files I have.  I also know I can't load all the files at once into RACluster as it fills all available memory.  So my question is how can I accomplish this while making sure I capture most flows that span multiple files.

Once I understand this, I hope to be able to do things like create a list of flow sizes (in bytes) for port 25.  Basically I will be asking a lot of questions involving all flows that match a certain filter and I am not sure how to accommodate for flows spanning multiple files.

A separate question.  I don't think Argus has this ability, but I wanted to know if the community already had a utility for this.  I am looking into creating a DB of some sort that would match Argus's flow IDs to pcap file name(s) and packet numbers.  This way one could extract the packets for a flow that needed further investigation.

And finally, thanks for the great tool.  It does a number of things I have been doing manually for a while.

Thanks,
Nick










Nick Diel | 1 Mar 02:56

Filter Flows with fins/resets

I am interested in finding flows that contain fin-finacks or resets and where data packets continue in the flow after the fin-finacks or reset (usually the data packets continue only in one direction).  I know I can filter for flows that contain fin-finacks or resets, but finding flows with the previous criteria is stumping me.  I am guessing this information is not easily available just from the argus flow records, so just a filter will probably not work.  I am looking at using racluster to help me.  Here is my current thought process:
  1. Merge status flow records only up to the point of a fin-finack or reset (not sure if this is possible)
  2. Take all flows that just contain data packets and most likely resets, but no syns or fins, and merge them with the above flows.
  3. Any flows that did merge successfully will be the flows I am interested in.
Any ideas or thoughts would be appreciated.

Nick


Nick Diel | 1 Mar 19:41

Re: racluster() memory control Re: New To Argus

Carter,

I found the man page that describes status (for people searching the list: man 5 racluster).

Thanks,
Nick

Nick Diel wrote:
Carter,

Thanks for the information.  I have been playing around with the timeout period with great success, though what is the status entry for?  If this is documented somewhere, I apologize, but I couldn't find it.

I think the radark() method is quite clever, but in my situation I am not able to do that (yet).  I am capturing data at a transit provider and immediately anonymizing the data.  I don't have access to know which subnets are darks, but I will investigate if I can find one.  I do think this could be a powerful research tool for me.

For now after I merge status flows, I think I will create a filter to purge the port scans from some of my outputs.

Nick

Carter Bullard wrote:
The default idle timeout is infinity.

I think if you pre-process the stream with something like radark()
which provides you with the IP addresses of scanners, you can
reject traffic involving those addresses or you just filter traffic
that is going to IP addresses that don't exist, you will do well.

We have limits in the code, I just need to reduce the number so it
doesn't kill the average machine.  We have means of passing the
limit to the clients as well, in the .rarc file, so that should be easy
to do.

Carter



On Feb 28, 2008, at 4:59 PM, Nick Diel wrote:

Carter,

I am going to start playing around with idle=timeout, if that parameter is not specified is there a default value used or will all flows stay in cache?  Though this parameter looks very promising for my use.

Where we do most of our capturing we can see millions of port scans in a 12 hour trace, so that is an issue for us too when we do flow filtering.  I wonder if a separate timeout would be useful for flows that just have a syn?  Basically trying to purge out port scans faster.  Or maybe in a memory constraint model these flows are picked first to be outputted.

I also think flushing out "complete" tcp flows is a good idea too.  Maybe a second timeout should be in place for these flows (it would be shorter than the regular timeout and potentially 0), that way if you wanted you could capture anomalies such as duplicate fin acks.  This second timer could also be used for flows that have a reset, since it is very common to see additional packets after a reset (packets still on the wire, reset gets lost, etc.).

Finally, I can see how a memory limit could be beneficial.  While yes it does create a problem where results are going to be influenced on size of memory available, it will allow for processing that may not otherwise be possible (or at least easily doable).  When I was producing a list of flows on port 25, I had to use very aggressive filters to handle memory issues and I know I missed some flows anyways.  We ended up with 20 million plus flows for our 12 hour capture.  I would have been willing to give a memory limit and had known that possibly not all flows would have been combined properly.  In my case at least I would expect most flows outputted early due to memory constraints would have been port scans and complete flows that haven't reached their idle timeout yet.  So again this would be a site specific option.  Per your list:

   1. filter input data
   2. change the flow model definition to reduce the number of flows,
   3. use the "idle=timeout" option on items in the racluster.conf file.
   4. use memory limit for very large data sets with knowledge it could affect the actual output.

Basically a memory limit is used when the others are not working.  It just allows for processing that may not have been easily possible.

Nick


Carter Bullard wrote:
Hey Nick,
I can put memory limits into racluster(), but then there is the possibililty
that you get different results based on the available space.  I'm not sure
that is the right way to go, but who knows, it maybe a great option.

The trick to keeping racluster memory use down is to:
   1. filter input data
   2. change the flow model definition to reduce the number of flows,
   3. use the "idle=timeout" option on items in the racluster.conf file.  

This all needs to be customized for each site, so working with the
racluster.conf file is the way to go, and running different .conf files
against one sample test data file allows you fine tune the configuration.

Getting darknet traffic out of the mix is important.  For many sites
"all the flows" are really scans, and should be ignored, 99.999%
of the time.  I track if something new responds to a scan, not that the
scan exists, because there is always a scan, because many of the sites
that I pay attention to have literally 100,000's of scans a day.  As a
result, we want to pay attention to the originator of the scan, the scan
type, if the addresses involved are real, if its coming from inside or outside
and if there was a "new" response.  Sloughing scan traffic to tools
that do scan analysis, and tracking the other flows makes this
a doable thing, and programs like radark()/rafilteraddr() help here
(but they are just examples).

For traffic you really want to track, modifying the flow model allows
us to reduce the number of flow caches, say, by ignoring the source
port for flows going to well known servers.  Lines like this:

   filter="dst host 1.2.3.4 and src pkts eq 1 and dst pkts eq 1 and dst port 53" model="saddr/16 daddr proto dport"

will reduce the number of in memory caches for this DNS server to
just the number of class B networks hitting the server.  The filter
only applies to successful connections, and the resulting aggregation
record will contain stats like the number of connections and avgdur.
(I'm assuming here that you will run racluster() against a file).

You can have literally 1000's of these filters to generate the data
you really want.

When a flow hits its idle time, racluster will write out the record (if
needed) and deallocate the memory used for that flow.  So having
aggressive idle times is very helpful.

But now that I'm thinking about it, we don't really have a way of
flushing records based on the flow state for flows that are being merged
together.  What I mean by that, is that if a TCP connection has 12 status
records, and we finally get the last one that "closes" the connection,
there isn't a way, currently, for us to check to see if that flow should
be "flushed".

Possibly we should take the resulting merged record, and run it back
through the filters to see if there is something we need to do with it.

Well, anyway, keep sending email if any of this is useful.

Carter



On Feb 28, 2008, at 1:25 PM, Nick Diel wrote:

Carter,

Thanks for all of your input.  Also thanks for the updated Argus.

After reading what you said, I can understand why Argus was designed the way it was.  I was just initially evaluating Argus with some very simple and discrete examples.  Looking at some of the source code also helped me wrap my head around Argus.

On to the memory issue.  The system I am using has 2GB in it and racluster wants to use all of it.  When 1.7GB< starts to get used by racluster heavy swapping occurs and racluster's cpu usage drops below 25%.  So this was why I was thinking out loud about potentially giving racluster a memory limit from the command line.  This way the system could avoid the heavy swapping and just have racluster write out the oldest records before moving on.

Again thanks for putting up with me as I start to understand Argus.

Nick

Carter Bullard wrote:
Hey Nick,
The problem with packet capture, is primarily the disk performance.
Argus can go as fast as you can collect packets, assuming that
you're using Endace cards, and although argus does great in the
presence of packet loss, it generates its best results when it gets
everything.

The best architecture is to run argus on the packet capture box,
and to blow argus records to another machine that does the disk
operations to store records.  This division of labor works best for
the 10Gbps capture facilities.

We sort input file names in the ra* programs, so doing this for
argus is a cut, copy, paste job.  No problem, I'll put it in this week.

Argus can read from stdin.

There are many incantations that work to decrease the memory
demands of an argus client.  Just really need to know what it is
that you want to do.

OK, to your question.  

Now let me ask about what I have been working on (merging flows across argus data files).  First, if I was capturing with Argus (not reading pcap files, capturing off the wire: argus | raspilt) wouldn't I run into the same problem of having flows broken up across different argus files?

If racluster is merging records as it finds them (not reading all records into memory first), it seems it might be nice to specify a  memory limit for racluster at command line.  Then as racluster approaches the memory limit it could remove the oldest records from memory and print them to the output.

Multiple argus status records spanning files.  Well, yes that is the actual design
goal.  When you think about most types of operations/security/performance
analysis, you want to see flow data scoped over some time boundary.  Regardless
of what that boundary is, whether its the last millisecond, second or minute or hour,
you will have flows that span that boundary.   There are a lot of flows that are
persistent, so you can't have a file big enough to hold complete flows, ....,
really.

But you don't seem to be too interested in really granular data, so you should
modify the ARGUS_FAR_STATUS_INTERVAL value to be something larger
than your file duration.  That way argus generates  only one record per flow
per file.  You use ra() to split files that are complete from those that may
continue into the next file, using the "-E" option and then after you're done
with all the files you have, then run racluster() against these continuation files.

   for i in *pcap; do argus -S 5m -r $i -w $i.argus; done
   for i in *argus; do ra -r $i -E $i.cont -w argus.out - tcp and ((syn or synack) and (fin or finack or reset)); done
   racluster -r *.cont -w argus.out

They won't be sorted, but thats easy to do with an additional step:
   rasplit -M nomodify -r argus.out -M time 5m -w data/argus.%Y.%m.%d.%H.%M.%S
   rm argus.out
   rasort -R data -M replace
   ra -R data -w argus.out
   rm -rf data

Or at least something like that should work.  The "-M nomodify" is critical, as rasplit()
will break records up on time boundaries if you don't specify this option, which
puts you back in trouble if you're really trying to keep the flows together.

Argus clients aren't supposed to consume more than, what, 1GB of memory, so there
are limits in the code.  Do you have a smaller machine than that?


Carter


On Feb 25, 2008, at 2:01 PM, Nick Diel wrote:

Carter,

First of all thanks for your detailed response and updated clients.  And I am glad you like twists.

Let me tell you a little bit more about the research setup.  The research project I am part of (made up of several universities in the US) has several collection boxes in different large commercial environments.  The boxes were customized specifically for high speed packet capturing (RAID, Endace capture card, etc.).  We will run a 12 hour capture and then analyze the capture for some time.  Sometimes up to several months.  So I do have time to correctly create my argus output files and do any other processing I need to do.

Some of the researchers focus on packet based research, where as other parts of the group focus more on flow based analysis.  So Argus looks like a great match for us.  Immediately after the capture, we can create Argus flow records and do our flow analysis with Argus clients.

So for my first question, is Argus capable of capturing at high line speeds (at least 1Gbit) where doing a packet capture using libpcap and a standard NIC may fail (libpcap dropping packets)?  Or, since Argus is flow based, does it not care if it misses packets?  Some of the anomalies we research require us to account for almost every packet in the anomaly, so dropping, say, every 100th or even every 1000th packet could hamper us.  The reason I ask about Argus high speed captures is that, if it is very capable at high speeds, it would allow us to deploy more collection boxes (these boxes would then primarily be used by the flow based researchers).  We wouldn't have to buy an expensive capture card for each collection box.

As for reading multiple files into Argus, one easy way to accomplish this would be for Argus to be able to read pcap files from stdin.  Then one can use a utility such as mergecap or tcpslice to feed Argus a list of out of order files: mergecap -r /packets/*.pcap -w - | argus -r - ....

My files are named so chronological order equals lexical order, so argus -r * would work in my case (this helps us with a number of utilities we use).  I do understand that actually implementing this in Argus would probably require a number of things, such as dying when files are out of order and then telling the user what order argus was reading the files in.  Though doing this would be quite a bit faster than having tcpslice or mergecap feed Argus the pcap files.

Now let me ask about what I have been working on (merging flows across argus data files).  First, if I was capturing with Argus (not reading pcap files, capturing off the wire: argus | rasplit) wouldn't I run into the same problem of having flows broken up across different argus files?

If racluster is merging records as it finds them (not reading all records into memory first), it seems it might be nice to specify a  memory limit for racluster at command line.  Then as racluster approaches the memory limit it could remove the oldest records from memory and print them to the output.
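
A rough sketch of the eviction idea in Python, to make the proposal concrete. This is not how racluster is implemented; the class name, the `max_flows` count (standing in for a real memory limit) and the record fields are all made up for illustration.

```python
from collections import OrderedDict


class BoundedFlowCache:
    """Merge records per flow key; evict the oldest flow when the cache is full.

    Hypothetical illustration only: max_flows stands in for a memory limit,
    and "oldest" means the flow whose first record arrived earliest.
    """

    def __init__(self, max_flows, emit):
        self.max_flows = max_flows
        self.emit = emit              # callback that "prints" an evicted flow
        self.flows = OrderedDict()    # insertion order tracks flow age

    def add(self, key, pkts, nbytes):
        if key in self.flows:
            # Known flow: merge the counters into the cached record.
            rec = self.flows[key]
            rec["pkts"] += pkts
            rec["bytes"] += nbytes
            return
        if len(self.flows) >= self.max_flows:
            # Cache is full: write out the oldest flow to make room.
            old_key, old_rec = self.flows.popitem(last=False)
            self.emit(old_key, old_rec)
        self.flows[key] = {"pkts": pkts, "bytes": nbytes}

    def flush(self):
        # End of input: write out everything still cached, oldest first.
        while self.flows:
            key, rec = self.flows.popitem(last=False)
            self.emit(key, rec)
```

The trade-off is the one discussed in this thread: a flow that is evicted early and then continues will show up as more than one record in the output.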

I was able to use your suggestion successfully to merge most of my flows together.  Though I needed to make a few modifications to the filter.  I moved parentheses: "tcp and ((syn or synack) and ((fin or finack) or reset))" vs. "tcp and (((syn or synack) and (fin or finack)) or reset)".  And I added "not con" to filter out the many, many packet scans, though this also does not merge syn-synack flows which exist at the end of the argus output files.  This filter still caused most of the memory to be used, but not a whole lot of time was spent in the upper range where swapping was slowing the system to a crawl.  Without "not con" I would reach the upper limits of memory usage quite fast and go into a crawl with the swapping.
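
To see what moving the parentheses changes, here is a small Python sketch that evaluates both groupings on a set of TCP state flags. The function names and the set-of-strings representation are assumptions for illustration; they are not part of the argus filter language.

```python
def first_grouping(flags):
    # tcp and ((syn or synack) and ((fin or finack) or reset))
    started = "syn" in flags or "synack" in flags
    ended = "fin" in flags or "finack" in flags or "reset" in flags
    return started and ended


def second_grouping(flags):
    # tcp and (((syn or synack) and (fin or finack)) or reset)
    started = "syn" in flags or "synack" in flags
    closed = "fin" in flags or "finack" in flags
    return (started and closed) or "reset" in flags


# A record that only carries a reset (its start was seen in an earlier file)
# is matched by the second grouping but not by the first.
for flags in ({"reset"}, {"syn", "reset"}, {"syn"}, {"syn", "fin"}):
    print(sorted(flags), first_grouping(flags), second_grouping(flags))
```

The two filters only disagree on records that have a reset but no syn or synack.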

Thanks again for all your help,
Nick


Carter Bullard wrote:
Hey Nick,
The argus project from the very beginning has been trying
to get people away from capturing packets, and instead
capturing comprehensive flow records that account for every
packet on the wire.  This is because capturing packets at modern
speeds seems impractical, and there are a lot of problems that can
be worked out without all that data.

So to use argus in the way you want to use argus is a bit of a
twist on the model.  But I like twists ;o)

>>> To start out with something simple I want to be able to count the number of flows over TCP port 25.

The easiest way to do that right now is to do something like this in bash:

   % for i in pcap*; do argus -r $i -w - - tcp and port 25 | \
        rasplit -M time 5m -w - argus.data/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S ; \
        done

That will put the tcp:25  "micro flow" argus records into a manageable
set of files.  Now the files themselves need to be processed to
get the flows merged together:

   % racluster -M replace -R argus.data

So now you'll get the data needed to ask questions, split into 5m bins,
so to speak.  Changing the "5m" to "1h", "4h", or "1d" may generate
file structures that you can work with, but eventually you will hit a memory
wall, unless you do something clever.

Now that you have these intermediate files, in order to merge the
tcp flows that span multiple files, you will need to give racluster()
a different aggregation strategy than the default.  Try a
racluster.conf file that contains these lines against the argus files
you have.

------- start racluster.conf ---------

filter="tcp and ((syn or synack) and ((fin or finack) or reset))"  status=-1 idle=0
filter="" model="saddr daddr proto sport dport"

------- end racluster.conf --------

What this will do is:
   1. any tcp connection that is complete, where we saw the beginning and the
       end, just pass it through, don't track anything.
   2. any partial tcp connection, track and merge records that match.

So it only allocates memory for flows that are 'continuation' records.
The output is unsorted, so you will need to run rasort() if you want
to do any time oriented operations on the output.
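
The two rules can be pictured with a short Python sketch. It only mimics the idea (complete TCP records pass straight through, everything else is merged on the 5-tuple); the record layout with `flags`, `bytes` and so on is an assumption, and racluster's real behavior has more to it than this.

```python
def is_complete(rec):
    # Rule 1: a tcp record where we saw both the beginning and the end.
    flags = rec["flags"]
    started = "syn" in flags or "synack" in flags
    ended = "fin" in flags or "finack" in flags or "reset" in flags
    return rec["proto"] == "tcp" and started and ended


def cluster(records):
    passed = []    # complete flows, never cached
    partial = {}   # only continuation records take up memory
    for rec in records:
        if is_complete(rec):
            passed.append(rec)
            continue
        # Rule 2: merge on "saddr daddr proto sport dport".
        key = (rec["saddr"], rec["daddr"], rec["proto"],
               rec["sport"], rec["dport"])
        if key in partial:
            merged = partial[key]
            merged["bytes"] += rec["bytes"]
            merged["flags"] |= rec["flags"]
        else:
            # Copy so the caller's record is not modified.
            partial[key] = dict(rec, flags=set(rec["flags"]))
    return passed, list(partial.values())
```

As with the real output, nothing here is sorted by time.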

In testing this, I found a problem with parsing "-1" from the status
field in some weird conditions, so I fixed it.  Grab the newest
clients from the dev directory if you want to try this method.

ftp://qosient.com/dev/argus-3.0/argus-clients-3.0.0.rc.69.tar.gz

Give that a try, and send email to the list with any kind of result
you get.

With so many pcap files, we probably need to make some other
changes.

The easiest way for you to do what you eventually want to do
would be for you to say something like this:
   argus -r * -w - | rawhatever

This currently won't work, and there is a reason, but maybe we
can change it.  Argus currently can read multiple input files, but you
need to specify each file using a "-r filename -r filename " like command
line list.   With 1000's of files, that is somewhat impractical.  It is this
way on purpose, because argus really does need to see packets in time order.
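
Until something like that exists, a small helper can build the "-r filename -r filename" list for you. This Python sketch assumes the file names sort lexically into time order; the function name is made up, and only the `-r` and `-w -` options mentioned in this thread are used.

```python
def argus_read_args(filenames):
    """Build an argus command line that reads the files in sorted order.

    Assumption: lexical order of the names equals chronological order.
    """
    args = ["argus"]
    for name in sorted(filenames):
        args += ["-r", name]
    args += ["-w", "-"]
    return args


print(" ".join(argus_read_args(["trace.0002.pcap", "trace.0001.pcap"])))
```

With thousands of files the resulting command line may run into the operating system's argument length limit, so this is only a stopgap.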

If you try to do something like this:

   argus -r * -w - | rasplit -M time 5m -w argus.out.%Y.%m.%d.%H.%M.%S

which is designed to generate argus record files that represent packet
behavior with hard cutoffs every 5 minutes, on the hour; if the
packet files are not read in time order, you get really weird
results.  It's as if the realtime argus was jumping into the future and
then into the past and then back to the future again.

Now, if you name your pcap files so they can be sorted, I can
make it so "argus -r *" can work.  How do you name your pcap files?


Because argus has the same timestamps as the packets in your
pcap files, the timestamps can be used as an "external key" if
you will.  If you build a database that has tuples (entries) like:

   "pcap_filename start_time end_time"

then by looking at a single argus record, which has a start time
and an end time, you can  find the pcap files that contain its packets.
And with something like perl and tcpdump or wireshark, you can
feed a simple shell to look in those pcap files looking for packets
with this type of filter:

   (ether host $smac and $dmac) and (host $saddr and $daddr) and port \
   ($sport and $dport)

and you get all the packets that are referenced in the record.
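
A minimal Python sketch of that lookup, assuming the index is a list of (pcap_filename, start_time, end_time) tuples with times as plain numbers. The function names and the record keys are invented for the example, and the filter string is written with pcap's `port` keyword.

```python
def files_for_record(index, rec_start, rec_end):
    """Return the pcap files whose time range overlaps the record's."""
    return [name for name, start, end in index
            if start <= rec_end and end >= rec_start]


def packet_filter(rec):
    """Build a tcpdump-style filter for the packets behind one record."""
    return ("(ether host {smac} and {dmac}) and (host {saddr} and {daddr}) "
            "and port ({sport} and {dport})").format(**rec)


index = [("a.pcap", 0, 10), ("b.pcap", 10, 20), ("c.pcap", 20, 30)]
print(files_for_record(index, 8, 12))
```

Each file returned would then be handed to tcpdump or wireshark together with the filter string.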


Carter




On Feb 21, 2008, at 4:49 PM, Nick Diel wrote:

I am new to Argus, but have found it has great potential for the research project I work on.  We collect pcap files from several high traffic networks (20k-100k packets/second).  We collect for approximately 12 hours and have ~1000 pcap files that are roughly 500MB each.
I want to do a number of different flow analyses and think Argus might be perfect for me.  I am having a hard time grasping some of the fundamentals of Argus, but I think once I get some of the basics I will be able to really start to use Argus.

To start out with something simple I want to be able to count the number of flows over TCP port 25.  I know I need to use RACluster to merge the Argus output (I have one argus file for each pcap file I have),  that way I can combine identical flow records into one.  I can do this fine on one argus output file, but I know many flows span the numerous files I have.  I also know I can't load all the files at once into RACluster as it fills all available memory.  So my question is how can I accomplish this while making sure I capture most flows that span multiple files.

Once I understand this, I hope to be able to do things like create a list of flow sizes (in bytes) for port 25.  Basically I will be asking a lot of questions involving all flows that match a certain filter and I am not sure how to accommodate for flows spanning multiple files.

A separate question.  I don't think Argus has this ability, but I wanted to know if the community already had a utility for this.  I am looking into creating a DB of some sort that would match Argus's flow IDs to pcap file name(s) and packet numbers.  This way one could extract the packets for a flow that needed further investigation.

And finally, thanks for the great tool.  It does a number of things I have been doing manually for a while.

Thanks,
Nick











Marten Bauer | 3 Mar 13:52

Re: graph of bytes against protocols for network loop detection?

Hallo Carter,

thanks for your help and the gnuplot script.
Last week I tried to code a plot with
python/matplotlib and did the following.

1. Copy argus.logs from the server to my workstation
2. Split the logfiles on an hourly basis (to isolate the moment when the
   network loop appears, etc.).  The result is hundreds of files.
3. racluster the hundreds of files to get a distribution of bytes against
   protocols:
   "racluster -m proto -r %s -s proto sbytes dbytes spkts dpkts load > %s"%(inputfile,outputfile)
4. Read the files and create a data structure
5. Plot this data into various plots
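
Step 4 could look roughly like the following. This is a sketch that assumes the racluster output is a whitespace-separated table with a header line and integer counters, as in the sample quoted in Carter's gnuplot script further down; the function names are made up.

```python
SAMPLE = """\
Proto  SrcPkts  DstPkts     SrcBytes     DstBytes
  pim    53267    18086     48793554      1085160
 ospf     1764        0       213220            0
"""


def parse_table(text):
    """Turn racluster text output into {proto: {column: value}}."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    table = {}
    for row in rows:
        # First column is the protocol name, the rest are integer counters.
        table[row[0]] = {name: int(value)
                         for name, value in zip(header[1:], row[1:])}
    return table


def total_bytes(table):
    """Sum source and destination bytes per protocol."""
    return {proto: cols["SrcBytes"] + cols["DstBytes"]
            for proto, cols in table.items()}


print(total_bytes(parse_table(SAMPLE)))
```

A column such as load is not an integer, so the `int()` conversion would need to be relaxed for output that includes it.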

It's working fine with 2d plots and today I will try to make a 3d plot.

Is it possible to do steps 2 and 3 in an easier way to get the same result?

Thx for helping

Carter Bullard schrieb:
> Hey Marten,
> Here is a simple gnuplot plot file that will generate a graph
> of "Total Bytes By Protocol" using argus data.   This graphs src and
> dst bytes per protocol separately; if you want just total bytes,
> then the change is really simple.
>
> There are a few things that you will want to modify, like adding
> a date string to the title, etc, but this should be a good start for you.
>
> So assuming your gnuplot is installed in /opt/local/bin/gnuplot
> (change the first line if this needs to be changed), put the included
> script in the file "barchart.bytesxproto.plt" and then:
>
>    % chmod 755 barchart.bytesxproto.plt
>    % racluster -m proto -r argus.out -s proto spkts dpkts sbytes dbytes > racluster.dat
>    % ./barchart.bytesxproto.plt
>
> And you'll get a window that pops up with a graph in it.
>
> If you want to discuss how to get other graphs out of argus data,
> just send email to the list and we'll talk about it.
>
> Carter
>
> ------ begin barchart.bytesxproto.plt ------
> #!/opt/local/bin/gnuplot -persist
> #
> #       G N U P L O T
> #       Version 4.2 patchlevel 2
> #       last modified 31 Aug 2007
> #       System: Darwin 9.2.0
> #
> #       Copyright (C) 1986 - 1993, 1998, 2004, 2007
> #       Thomas Williams, Colin Kelley and many others
> #
> #       Type `help` to access the on-line reference manual.
> #       The gnuplot FAQ is available from http://www.gnuplot.info/faq/
> #
> #       Send bug reports and suggestions to <http://sourceforge.net/projects/gnuplot>
> #
> #
> reset
> #
> # Create simple barchart of Total Bytes by Protocol
> # The racluster.dat file was generated using:
> #
> #     racluster -m proto -r argus.out -s proto spkts dpkts sbytes dbytes
> #
> # And is of the format:
> #
> # Proto  SrcPkts  DstPkts     SrcBytes     DstBytes
> #   pim    53267    18086     48793554      1085160
> #  ospf     1764        0       213220            0
> #  [more]
> #
> set termoption font "Verdana, 12"
> set size square 0.90,0.90
> set bmargin 4
> set title "Total Bytes By Protocol" font "Verdana,22"
> set style data histogram
> set style histogram cluster gap 1
> set style fill solid border -1
> set tics font "Verdana,14"
> set boxwidth 0.80
> set grid
> set ylabel "Log Total Bytes" font "Verdana,18"
> set logscale y 10
> set auto y
> set label 1 "Generated by Argus using Gnuplot"
> set label 1 at graph 1.02, 0.62 rotate by 90 font "Verdana,9"
> #
> set key autotitle columnhead
> plot 'racluster.dat' using 4:xticlabels(1) ti col, \
>      ''              using 5 ti col
> #
>
>
> ------ end barchart.bytesxproto.plt ------
>
>
> On Feb 27, 2008, at 1:52 AM, Marten Bauer wrote:
>
>> Hello,
>>
>> for detecting network loops I need a graph which
>> prints the protocol on the x axis and the amount of
>> bytes on the y axis.
>>
>> I tried to achieve this with ragraph, but I never got
>> what I wanted.
>>
>> Is it possible with ragraph or another ra* tool to
>> generate such a plot?
>>
>> Thx for helping
>> Marten
>>
>

Peter Van Epp | 3 Mar 18:51

Re: racluster() memory control Re: New To Argus

On Sat, Mar 01, 2008 at 11:41:09AM -0700, Nick Diel wrote:
> Carter,
> 
> I found the man page that describes status (for people searching on the 
> list man 5 racluster).
> 
> Thanks,
> Nick
> 
> Nick Diel wrote:
> >Carter,
> >
> >Thanks for the information.  I have been playing around with the 
> >timeout period with great success, though what is the status entry 
> >for?  If this is documented somewhere, I apologize, but I couldn't 
> >find it.
> >
> >I think the radark() method is quite clever, but in my situation I am 
> >not able to do that (yet).  I am capturing data at a transit provider 
> >and immediately anonymizing the data.  I don't have access to know 
> >which subnets are darks, but I will investigate if I can find one.  I 
> >do think this could be a powerful research tool for me.

	Although I haven't had time to play with radark yet, I don't think the
anonymization will matter to it. As I understand it, it selects dark IPs by
doing a prescan for IPs that don't respond to anything in the time period and
then reprocesses the flows looking at traffic to the dark IPs on the assumption
they are attacks. I don't think anonymization will affect that and you should
be able to feed information back to the TX about what kind of attacks they 
are seeing and if they look to be having success (i.e. an attacking IP making
a successful connection to a real host and doing a fair amount of traffic).
	It may motivate the TX to install argus for themselves with
non-anonymized data to figure out who the attacked hosts are if the cost looks
high enough, or it may reassure them that their security measures are keeping
the noise at an acceptable level (that's essentially what my argus traffic
scripts do for us on our campus). Either of those things could be of value
to them and an encouragement to let you keep capturing traffic from them.
	We have a data mining research project here that produced such
interesting results from the anonymized data that the agency owning the data
created a test system to run against the real data and act on what it
extracts, and is now talking about installing a parallel system
using non-anonymized data beside our anonymized one (that only their people
have access to) to be able to always do that now that the value has been
demonstrated (which was part of the reason for the original project).
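
To illustrate why anonymization should not matter to the prescan described above, here is a Python sketch of the idea as I understand it. It is not radark's actual algorithm; the flow fields (`saddr`, `daddr`, `dpkts`) and function names are assumptions. It only compares addresses for equality, so consistently anonymized addresses work just as well as real ones.

```python
def find_dark(flows):
    """Addresses that were contacted but never answered or initiated anything."""
    contacted = {f["daddr"] for f in flows}
    responded = {f["daddr"] for f in flows if f["dpkts"] > 0}
    initiated = {f["saddr"] for f in flows}
    return contacted - responded - initiated


def scanners(flows):
    """Second pass: sources that sent traffic to a dark address."""
    dark = find_dark(flows)
    return {f["saddr"] for f in flows if f["daddr"] in dark}


flows = [
    {"saddr": "A", "daddr": "X", "dpkts": 5},    # X answers, so it is live
    {"saddr": "S", "daddr": "D1", "dpkts": 0},   # no reply from D1
    {"saddr": "S", "daddr": "D2", "dpkts": 0},   # no reply from D2
]
print(find_dark(flows), scanners(flows))
```

With only unidirectional visibility of some subnets, live hosts would look dark to this sketch, which is the caveat to keep in mind.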

Peter Van Epp / Operations and Technical Support 
Simon Fraser University, Burnaby, B.C. Canada

Nick Diel | 3 Mar 19:06

Re: racluster() memory control Re: New To Argus

Peter,

Very interesting.  Currently we are only monitoring partial traffic and a few organizations using the TX have other providers so there is asymmetric routing.  Then for some subnets we only have unidirectional traffic, so that might affect the results.  Though I think I will still play around with this tool.  If I am still able to find some interesting attack traffic, the TX might appreciate that information.  Giving the TX some benefits for us being there would be a good thing. :)

Nick
Peter Van Epp | 3 Mar 21:30

Re: racluster() memory control Re: New To Argus

> >  
> Peter,
> 
> Very interesting.  Currently we are only monitoring partial traffic and 
> a few organizations using the TX have other providers so there is 
> asymmetric routing.  Then for some subnets we only have unidirectional 
> traffic, so that might affect the results.  Though I think I will still 
> play around with this tool.  If I am still able to find some interesting 
> attack traffic, the TX might appreciate that information.  Giving the TX 
> some benefits for us being there would be a good thing. :)
> 
> Nick

	That's another good use of argus :-). I sometimes find asymmetric routes
due to policy (i.e. CA*net4 accepts the route because it's in the RR but a wrong
BGP filter somewhere in the path sends it back commodity). This shows up in
the commodity traffic report like this:

> >
> >     but comes back in commodity:
> >     
> >199.60.1.4           8,349,227,827 Tot     2,144,505,502 Out     6,204,722,325 In
> >
> >  128.252.252.48             6,115,420,960                  0     6,115,420,960
> >  128.252.252.48:22          6,115,420,960                  0     6,115,420,960
> > 

	This was someone in our CS department heading somewhere on I2 and 
being bitten by a BGP filter, the out is 0 because it is on our C4 link
which is clear channel gig but the return is coming in commodity (130 megs,
saturated and packet shaped) which is both bandwidth restricted and costs 
money. Reporting this up the line usually gets the filter corrected and 
everybody wins (except the gigapops that have to correct the filter of
course :-)) as the user gets better throughput and we (who pay the bandwidth 
bill) get to waste the bandwidth saved in more P2P :-). In the case of 
multiple links, something like argus ids would be needed to figure out the 
source link (I have different argi on each link so it's easy for me). The 
traffic numbers (currently the top 30 bandwidth users in a day) are used to 
decide when this is worth doing because it gets fairly labour intensive for
the various gigapops in the path to figure out who has the bad BGP filter
and you only want to do it when there is a reasonable amount of bandwidth 
involved.
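
The check itself is simple enough to sketch in Python: look for peers with a large byte count in one direction and nothing in the other. The dictionary layout, function name and threshold are assumptions for illustration; the first entry uses the numbers from the report above and the second is a made-up balanced peer.

```python
def one_way_peers(report, min_bytes):
    """Peers with zero bytes out but at least min_bytes in, largest first.

    report maps peer address -> (out_bytes, in_bytes) on one link.
    """
    hits = [peer for peer, (out_bytes, in_bytes) in report.items()
            if out_bytes == 0 and in_bytes >= min_bytes]
    return sorted(hits, key=lambda peer: report[peer][1], reverse=True)


report = {
    "128.252.252.48": (0, 6115420960),   # from the commodity report above
    "192.0.2.10": (1000, 2000),          # hypothetical two-way peer
}
print(one_way_peers(report, min_bytes=1_000_000_000))
```

The threshold matters because, as noted, chasing the bad filter is only worth it when a reasonable amount of bandwidth is involved.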

Peter Van Epp / Operations and Technical Support 
Simon Fraser University, Burnaby, B.C. Canada

Stewart Gray | 3 Mar 21:43

Top talkers on particular service

Hey Guys,
 
A simple question, I'm sure.  How do you get a list of top talkers for a particular service?  In real terms, I'm seeing a large spike in https traffic and I'd like to know who is generating the traffic.  I've played with 'ramon -M Matrix' but I'm only interested in the src addresses initially.  Once I've determined the top talker it'd be good to drill down to find what it's talking to.
 
Have you considered putting an argus cheat sheet of sorts on your page?  It could cover a bunch of argus tool usage examples.  It'd be useful for these sorts of queries :)
 
Thanks,
 
Stew
#####################################################################################
Important: This electronic message and attachments (if any) are confidential and may be legally privileged. If you are not the intended recipient do not copy, disclose or use the contents in any way. Please let us know by return e-mail immediately and then destroy this message.
#####################################################################################

Re: Top talkers on particular service

Stew,

You could try the following.

racluster -r argus.* -M rmon -m saddr -w - - port https | rasort -m bytes -w - | ra -N 20 -s saddr trans:10 sbytes:14 dbytes:14 bytes:14
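
For anyone wondering what that pipeline boils down to, here is a simplified Python sketch: total the bytes per source address for flows on the service port, sort, and keep the top N. The record fields and function name are assumptions, port 443 stands in for https, and the `-M rmon` handling is not modeled at all.

```python
from collections import defaultdict


def top_talkers(records, port=443, n=20):
    """Return the n source addresses with the most bytes on the given port."""
    totals = defaultdict(int)
    for rec in records:
        if port in (rec["sport"], rec["dport"]):
            totals[rec["saddr"]] += rec["sbytes"] + rec["dbytes"]
    # Largest byte count first, like sorting on bytes and keeping the first n.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


records = [
    {"saddr": "192.0.2.1", "sport": 40000, "dport": 443, "sbytes": 100, "dbytes": 900},
    {"saddr": "192.0.2.2", "sport": 40001, "dport": 443, "sbytes": 50, "dbytes": 50},
    {"saddr": "192.0.2.1", "sport": 40002, "dport": 443, "sbytes": 10, "dbytes": 10},
    {"saddr": "192.0.2.3", "sport": 40003, "dport": 25, "sbytes": 5000, "dbytes": 0},
]
print(top_talkers(records, n=2))
```

Drilling down to what the top talker is talking to would be the same loop keyed on the (saddr, daddr) pair instead.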

Best regards,

Pablo J. Rebollo


ScottO | 3 Mar 21:56

Re: Top talkers on particular service

Hi Stew,

I do something similar that you could modify.  First I process the file through racluster via saddr:

/usr/local/bin/racluster -r $directory/$argus_file -M rmon -m saddr -w /tmp/a1temp_cluster.out - ip

Then I take the resulting file and do various things with it, one being just tallying up total traffic bytes:

/usr/local/bin/rasort -r /tmp/a1temp_cluster.out -m bytes -w - - net '$home_net' | /usr/local/bin/ra -N 20 -s saddr bytes:14 sbytes:14 dbytes:14

The above gives a nice list of top talkers, total traffic wise.  You could bpf the port(s) you want out of it.

Hope that helps,

Scott


