Carter Bullard | 3 Nov 2011 23:44

argus[-clients]-3.0.6 release preparations

Gentle people,
I am beginning the release process for argus and argus-clients-3.0.6.
If there are bugs that you have that need attention, please send email ASAP.
The biggest one on the list at this time is a bug in rasplit() where we end up
with an occasional record in 1970 when splitting based on time.  I'm working on this right now.

In order to make the clients distribution a bit more useful/grokable, we've
reorganized the directory structure to highlight what would be considered
core utilities and examples.   Core programs are still in the clients directory, all
other programs have been moved to an examples directory, where each
example has its own space, so to speak.  Some big programs like ramysql, and
ratop will now be found in the examples area.  Ratemplate.c is now in that area.
If you are developing your own clients, using the techniques of TieT, as an example,
this reorganization should not affect you at all.  If it does cause problems, holler !!!!!!!

This new organization is in argus-clients-3.0.5.23.tar.gz, currently on the server.
     http://qosient.com/argus/dev/argus-clients-latest.tar.gz

If this sucks, or if you like it, we'd really like to hear from you, so don't be shy.
Release is looking like 1-2 weeks in Dec.

Again if there is anything that you feel like we've missed, don't hesitate to speak up.
Thanks for all the help !!!!!!!

Carter
Carter Bullard | 4 Nov 2011 00:33

Re: Removing possibly unused metadata?

Hey Jason,
You should be able to remove the mac section, if you generated it, as that
doesn't generally have long term significance, assuming you realized that
the IP / mac pairing isn't anomalous.  You can generate the IP/mac report without
a massive hit, and then remove the mac DSRs.  You can remove the net DSR.
It contains the rtp and the tcp performance data that generally isn't of long term
interest, although it does have important forensics relevance if you're
concerned about TCP hijack detection or other TCP bad things.  But for many
situations, you can remove it.   The encaps DSR isn't big but it is 8 bytes
per record, and may not be a long lived one.

These would remove some significant bytes, and shouldn't be generally missed.

Carter

On Oct 28, 2011, at 5:06 PM, Jason Carr wrote:

> We write argus data into five minute chunked files.  We typically have +1G
> files for those 5 minutes.  Is there any metadata that we might be able to
> purge to decrease the size significantly?
> 
> I normally only care about StartTime, flags, proto, src/dst
> {mac,ip,port}, direction, packets, bytes, state, and user data in either
> direction.
> 
> I already gzip compress the files, I tried using bzip2 on a few test files
> and got a 1.1G file down to 500M instead of 539M, but I'm looking for a
> larger compression and/or size difference.
> 
> Thanks,

MN | 4 Nov 2011 19:46

Re: Removing possibly unused metadata?


Formerly, for data that we kept long-term, rounding time stamps to the
nearest 1/4 or 1/8 of a second reduced entropy sufficiently to make a
significant difference in compressed file sizes (this will not help on
non-compressed argus files).  I can send the old code if desired, but
it was for an older version of Argus.
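
As an illustration of the rounding described above, here is a minimal C sketch that snaps a timestamp to the nearest 1/N of a second. It is not the old 2.0.6-era code; the helper name round_timestamp and its parts_per_sec parameter are assumptions made for this example. The point is that after rounding, the microsecond field can take only N distinct values, which is what gives the compressor less entropy to encode.

```c
#include <assert.h>
#include <sys/time.h>

/* Round tv to the nearest 1/parts_per_sec of a second (e.g. 4 or 8).
 * Hypothetical helper for illustration; not taken from the Argus sources. */
static struct timeval
round_timestamp(struct timeval tv, long parts_per_sec)
{
    long long ticks;

    assert(parts_per_sec > 0 && parts_per_sec <= 1000000);
    assert(tv.tv_usec >= 0 && tv.tv_usec < 1000000);

    /* Index of the nearest grid point within the current second;
     * adding half a second's worth before dividing rounds ties upward. */
    ticks = ((long long)tv.tv_usec * parts_per_sec + 500000) / 1000000;

    if (ticks >= parts_per_sec) {
        /* Rounded up past the end of this second: carry into tv_sec. */
        tv.tv_sec += 1;
        tv.tv_usec = 0;
    } else {
        tv.tv_usec = ticks * 1000000 / parts_per_sec;
    }
    return tv;
}
```

With parts_per_sec set to 8, a timestamp of 100.130000 becomes 100.125000, and with 4 a timestamp of 100.950000 carries over to 101.000000.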

Now we save our longer term data in ascii format, saving just the fields
that we want, and using a combination of -p and RA_TIME_FORMAT.

Consider using xz instead of bzip2, especially if you look at the log
files frequently, as the decompression time is significantly less - at
the cost of longer compression times.  Note xz defaults to '-6'.

We've been keeping more than a year's worth of data on roughly ten 1-4g/s
links.

- mike

On Oct 28, 2011, at 5:06 PM, Jason Carr wrote:

> We write argus data into five minute chunked files.  We typically have +1G
> files for those 5 minutes.  Is there any metadata that we might be able to
> purge to decrease the size significantly?
> 
> I normally only care about StartTime, flags, proto, src/dst
> {mac,ip,port}, direction, packets, bytes, state, and user data in either
> direction.
> 
> I already gzip compress the files, I tried using bzip2 on a few test files

Carter Bullard | 7 Nov 2011 18:35

FloCon 2012, be there

Gentle People,
I will be giving a 1/2 day tutorial at FloCon 2012 this year, entitled "From Packet to Alarm: Real Time
Situational Awareness using Argus", which will trace how the Argus architecture and tools support this
type of awareness.  That should be on Mon Jan 9th, in Austin Tx.  I'm also going to give a presentation on
behavior monitoring metrics that are in the new argus-3.x source code, describing packet dynamics
sensing in Argus, and how to use those metrics in some interesting situations. 

We are also planning to hold at least one BOF on database support for large scale flow processing, so please
consider coming to Austin, Tx, Jan 9-12, 2012, to talk about argus.

Carter
MN | 8 Nov 2011 00:10

Re: Removing possibly unused metadata?


Hi Jason - 

These diffs are based on the 2.0.6 distribution, so are old.
They worked well for us over a several year period.  I believe
the ragator changes made about a 3% improvement.  The timestamp
and other rastrip changes made a variable difference depending
upon the mask, but were substantial.

Now, the asciification and xz compression - and lots more
storage - allow us to keep ~400 days.

Hope these help,
- mike

% diff ragator.c raGATOR.c
37a38,39
> int fromFilesOnly = 0;		/* -- MN */
> 
132a135,141
>       /* if we are not "real-time", then do not purge the queue as often -- MN */
>       if (rflag && !Sflag) {
> 	extern struct timeval RaClientTimeout;
> 	RaClientTimeout.tv_sec = 8;
> 	RaClientTimeout.tv_usec = 0;
>       }
> 
223a233
>    fprintf (stderr, "            -H bins[L]:range   Do Histogram-related processing (range is value-value, where value is %%d[ums]) [UNDOC'ed]"); /* -- MN */

Carter Bullard | 8 Nov 2011 01:07

Re: Removing possibly unused metadata?

Hey Mike,
Many of these ragator() changes are now incorporated into 3.0 racluster(), so
shouldn't have to modify the source.  

For the rastrip(), looks like you want to blow away the fractional part of the
timestamp?   I can add that to rastrip() later this week, but that won't
reduce the size of the stored record.  How would you want to specify that
on rastrip()'s  command line?

Zeroing out a value is more in line with ranonymize(), rather than rastrip().
I'd suggest that you just strip out the ' net ' DSR to achieve what you're after ?

Carter

On Nov 7, 2011, at 6:10 PM, MN wrote:

> 
> Hi Jason - 
> 
> These diffs are based on the 2.0.6 distribution, so are old.
> They worked well for us over a several year period.  I believe
> the ragator changes made about a 3% improvement.  The timestamp
> and other rastrip changes made a variable difference depending
> upon the mask, but were substantial.
> 
> Now, the asciification and xz compression - and lots more
> storage - allow us to keep ~400 days.
> 
> Hope these help,
> - mike

Carter Bullard | 8 Nov 2011 01:08

Re: Removing possibly unused metadata?

Hey Mike,
Oh yes, and I forgot to mention that if you store the ascii version as a CSV, leaving the
column names at the top, raconvert() will be able to convert them back to binary, if
you want to process them later.

Carter

On Nov 7, 2011, at 6:10 PM, MN wrote:

> 
> Hi Jason - 
> 
> These diffs are based on the 2.0.6 distribution, so are old.
> They worked well for us over a several year period.  I believe
> the ragator changes made about a 3% improvement.  The timestamp
> and other rastrip changes made a variable difference depending
> upon the mask, but were substantial.
> 
> Now, the asciification and xz compression - and lots more
> storage - allow us to keep ~400 days.
> 
> Hope these help,
> - mike
> 
> 
> % diff ragator.c raGATOR.c
> 37a38,39
>> int fromFilesOnly = 0;		/* -- MN */
>> 
> 132a135,141

MN | 8 Nov 2011 01:31

Re: Removing possibly unused metadata?


Hi Carter - 

> Many of these ragator() changes are now incorporated into 3.0 racluster(), so
> shouldn't have to modify the source.  

Agreed and thanks - I was just showing Jason what we had done as he'd
asked for the patches.

> For the rastrip(), looks like you want to blow away the fractional part of the
> timestamp?   I can add that to rastrip() later this week, but that won't
> reduce the size of the stored record.  How would you want to specify that
> on rastrip()'s  command line?

It will not reduce the size of the uncompressed record, but of the
compressed record it makes a _huge_ difference (lots of entropy is 
lost => much better compression).  

Now, for long term data (when we do not need precise time stamps),
we convert to ascii, round to Nths of a second and compress for
quite good compression.  We also create summary files (of seen IPs)
that drastically reduce needle-in-the-haystack searches.
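
One way such a seen-IP summary could work is sketched below in C. This is an assumption about the general technique, not a description of the actual summary files: each archive file gets a sorted, de-duplicated list of the IPv4 addresses it contains, and a search first checks that small list before decompressing the large file. The function names are hypothetical, and addresses are assumed to be 32-bit host-order integers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Ordering callback for qsort()/bsearch() over host-order IPv4 addresses. */
static int
cmp_ipv4(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Sort addrs[] in place, drop duplicates, and return the number of
 * unique addresses left at the front of the array. */
static size_t
build_ip_summary(uint32_t *addrs, size_t n)
{
    size_t i, out = 0;

    if (n == 0)
        return 0;
    assert(addrs != NULL);

    qsort(addrs, n, sizeof(addrs[0]), cmp_ipv4);
    for (i = 1; i < n; i++)
        if (addrs[i] != addrs[out])
            addrs[++out] = addrs[i];
    return out + 1;
}

/* Return 1 if ip appears in a summary produced by build_ip_summary(). */
static int
summary_contains(const uint32_t *summary, size_t n, uint32_t ip)
{
    return n > 0 &&
        bsearch(&ip, summary, n, sizeof(summary[0]), cmp_ipv4) != NULL;
}
```

A search for one host then only needs to open the archive files whose summary reports a hit, which is where the needle-in-the-haystack saving comes from.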

Here's details for one hour today for one tap, with rounding to 3
decimal places of time:

# gunzip < argus.13.gz | wc
9395130 66723491 2688418924

# ls -al *.13*

Carter Bullard | 8 Nov 2011 11:47

Re: Removing possibly unused metadata?

Hey Mike,
I really like that you get all that semantic control using ascii, but in my heart I know there is a better way :o),
since the compressed ascii isn't searchable, etc.

So, how about summary indexing, so that you have a sense of what is in there?  I'd keep an IP address index, like
rahosts() output, if you like ascii, or a simple racluster() to get mac, IP, trans, and both pkts and bytes,
and poke that into a database.  That has worked really well for me.

Carter

On Nov 7, 2011, at 7:31 PM, MN <m.newton <at> stanford.edu> wrote:

> 
> Hi Carter - 
> 
>> Many of these ragator() changes are now incorporated into 3.0 racluster(), so
>> shouldn't have to modify the source.  
> 
> Agreed and thanks - I was just showing Jason what we had done as he'd
> asked for the patches.
> 
>> For the rastrip(), looks like you want to blow away the fractional part of the
>> timestamp?   I can add that to rastrip() later this week, but that won't
>> reduce the size of the stored record.  How would you want to specify that
>> on rastrip()'s  command line?
> 
> It will not reduce the size of the uncompressed record, but of the
> compressed record it makes a _huge_ difference (lots of entropy is 
> lost => much better compression).  
> 

Carter Bullard | 8 Nov 2011 14:02

Re: argus 3.0.3-20 Bivio patch

Hey Joel
Been a long time, hope all is most excellent, and that you're still with Bivio.
A colleague pointed out that you guys left Argus off of your Network Flow Analysis Solution.
You seem to have a 2 page glossy on nProbe, nTop, YAF, SiLK and SANCP, but no Argus.
I'm very happy to be in the Application Library.  Any stats on Argus use?

Are you going to FloCon this year ?
Hope all is most excellent,

Carter

On Jan 21, 2011, at 4:52 PM, Joel Ebrahimi wrote:

I have been testing the latest development release on the Bivio platform.  There is an issue with one of the special devices we have from the Bivio interfaces. We have modified pcap in a number of ways, and one of these is to use the keyword ‘default’, which allows polling from all interfaces as a single logical interface. In the function ArgusGetInterfaceStatus there is an ioctl call that does not work with this interface. I modified the code to look for the keyword default and set the interface state. It's essentially the same thing that is done for the ‘dag’ card, only looking for ‘default’ now.
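
The kind of name check described above might look roughly like the following C sketch. This is not the attached patch and not the actual body of ArgusGetInterfaceStatus: the function names, the stand-in flag constants, and the query callback are all hypothetical, and it is an assumption that "set the interface state" means marking the device as up and running without calling ioctl.

```c
#include <assert.h>
#include <string.h>

/* Stand-in flag bits for this sketch only; real code would presumably
 * use IFF_UP and IFF_RUNNING from <net/if.h>. */
#define SKETCH_IF_UP       0x1
#define SKETCH_IF_RUNNING  0x2

/* Return 1 when the status ioctl should be bypassed for this device:
 * names containing "dag" (the existing special case mentioned above)
 * or the Bivio "default" keyword. The matching rules are assumptions. */
static int
skips_status_ioctl(const char *name)
{
    assert(name != NULL);
    return strstr(name, "dag") != NULL || strcmp(name, "default") == 0;
}

/* Fill *flags for a device. query is a hypothetical callback standing
 * in for the ioctl-based lookup; it is only called for ordinary
 * interfaces. Returns 0 on success. */
static int
get_interface_flags(const char *name,
                    int (*query)(const char *, int *), int *flags)
{
    assert(flags != NULL);
    if (skips_status_ioctl(name)) {
        *flags = SKETCH_IF_UP | SKETCH_IF_RUNNING;
        return 0;
    }
    return query(name, flags);
}
```

Keeping the name test in one small function means a further pseudo-device could be added by extending a single condition.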

 

Patch is attached.

 

Cheers,

 

// Joel

 

Joel Ebrahimi

Solutions Architect

Bivio Networks Inc.

<argus-3.0.3.20.bivio.diff>
