Carter Bullard | 9 Feb 17:02

argus-3.0.3 on the server, new ARGUS_EVENT feature

Gentle people,
argus-3.0.3 is now an official development thread for argus.  It will become argus-3.0.4
when it is stable and released.  I'm going to try to avoid the endless alpha, beta
numbers, and just have 3.0.3.x until we have 3.0.4.

OK, argus-3.0.3.1 has many bug fixes, as reported on the list.  If I have provided patches,
they are in this release.  You can use these packages in production, but new features may
have bugs, so consider the odd releases unstable or experimental, but they
should work.

New features are now coming out, based on the list I put on the web site.  This has
Peter's argusarchive() programs and documentation, as an example.  And argus has
gotten some tiny performance enhancements, but no major architectural changes.

argus-3.0.3 will always be backward compatible with argus-3.x, so you can use
the new clients in your existing infrastructure.  argus-3.0.3 server may generate
data that argus-3.0.2 clients may not like, so be aware that upgrading should
be thoughtful.  With argus-3.0.3,  new features will be coming really fast,
so consider this a very dynamic set of software.

In this first wave of new features, argus gets the "Argus Events" functions, which allow
you to add additional data sources to your argus data stream.  The idea is that we want to
correlate lots of different data with network events to solve problems.  The easiest way to
do this is to inject non-flow data into the flow data stream, so that as you collect, process
and archive argus data, the supporting non-flow data is available.  Using this strategy,
the observation point for the non-flow data is the argus probe, so the data has the same
source identifiers and sync'd timestamps.  If you build archives, then the argus event
data is stored along with the flow data, so you can retrieve relevant data through local
scoping rules.

Argus can be configured to run programs/scripts or read files, such as those in /proc,
periodically, as specified in its argus.conf file.   Argus runs the ArgusEvent generator
as a thread, using 

It will take the output it receives as a buffer, optionally compress it (only if that reduces
the size of the output) and then form an ArgusEvent record, which is composed of
an ArgusRecordHeader, an ArgusTransport DSR, so we can know who sent it,
an ArgusTime DSR, which has the starttime and endtime for running the program,
a string that describes what the script is/was, and an ArgusData DSR that holds the
buffer.  Argus then inserts the record into the argus output stream.
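
The "compress only if it reduces the size" step can be sketched in a few lines of Python. This is an illustration of the idea, not argus's own C code: zlib is an assumed stand-in for whatever compressor argus uses, and `pack_event_buffer` is a hypothetical name.

```python
import zlib
from typing import Tuple


def pack_event_buffer(output: bytes) -> Tuple[bytes, bool]:
    """Return (payload, compressed), keeping compression only when it shrinks the buffer."""
    candidate = zlib.compress(output)
    if len(candidate) < len(output):
        return candidate, True
    # Compression did not help (typical for very short output), so keep the raw bytes.
    return output, False


if __name__ == "__main__":
    repetitive = b"ifOutDiscards.2 Counter32: 0\n" * 50
    tiny = b"ok"
    for sample in (repetitive, tiny):
        payload, compressed = pack_event_buffer(sample)
        print(len(sample), len(payload), compressed)
```

The flag returned alongside the payload matters: a reader of the record has to know whether to decompress the buffer before using it.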

All the client programs are aware of ARGUS_EVENTs, and try to do the right thing with
them.  There is some work that needs to be done here.  A program called raevent() is
provided in the client distribution that prints the ARGUS_EVENT.  All other clients
basically ignore them at the moment.

The new argus configuration variables look like:
   ARGUS_EVENT_DATA="prog:/usr/local/bin/rasnmp:1m:compress"

The ./support/Config/argus.conf file has examples.
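
To make the layout of that value concrete, here is a small Python sketch that splits it into parts. The field names (method, target, interval, option) are labels inferred from the single example above, not taken from the argus source, and the interval parsing assumes only s/m/h suffixes. Check the sample argus.conf for the authoritative syntax.

```python
from typing import NamedTuple, Optional


class EventSpec(NamedTuple):
    method: str             # e.g. "prog" in the example above
    target: str             # path of the program to run
    interval_seconds: int   # how often to run it
    option: Optional[str]   # e.g. "compress", if present


# Assumed unit suffixes; only "1m" appears in the original example.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_event_data(value: str) -> EventSpec:
    """Split a method:target:interval[:option] string into its fields."""
    parts = value.split(":")
    if len(parts) not in (3, 4):
        raise ValueError("expected 3 or 4 colon-separated fields: %r" % value)
    method, target, interval = parts[:3]
    option = parts[3] if len(parts) == 4 else None
    if len(interval) < 2 or interval[-1] not in _UNITS or not interval[:-1].isdigit():
        raise ValueError("unrecognized interval: %r" % interval)
    seconds = int(interval[:-1]) * _UNITS[interval[-1]]
    return EventSpec(method, target, seconds, option)


if __name__ == "__main__":
    print(parse_event_data("prog:/usr/local/bin/rasnmp:1m:compress"))
```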

The initial example programs are rasnmp(), ravms(), and ralsof(), which can be found in
./bin of the argus distribution.  The concepts are that you can generate local stats, or
you can use argus to collect remote stats from other devices.   The examples show that
you can provide remote SNMP data, collected on whatever interval you want, local host
performance data, and user/process to flow data mappings (using lsof) so you can
attribute network flows to programs and users.  The examples are just bash or perl scripts
that generate ascii output structured as XML.  The XML schema could be improved upon,
for those interested.

The rasnmp() script is designed to gather the arp tables and counters from specific interfaces on
a switch in my enterprise.  It generates this type of output:

2010/02/09.10:03:34.379890:srcid=192.168.0.68:prog:/usr/local/bin/rasnmp
<ArgusEvent>
   <ArgusEventData Type = "SNMP Stats: /usr/bin/snmpwalk -Os -c xxxxxx -v 2c 192.168.0.1" >
      < Label = "ipNetToMediaPhysAddress.2.207.237.36.97 " Value = " STRING: 0:21:a0:ce:c:5" />
      < Label = "ipNetToMediaPhysAddress.3.192.168.0.3 " Value = " STRING: 0:f:b5:df:17:f8" />
      < Label = "ipNetToMediaPhysAddress.3.192.168.0.66 " Value = " STRING: 0:16:cb:ad:90:11" />
      < Label = "ipNetToMediaPhysAddress.3.192.168.0.68 " Value = " STRING: 0:23:32:2f:ac:9c" />
      < Label = "ipNetToMediaPhysAddress.3.192.168.0.164 " Value = " STRING: 0:b:db:5c:e5:7c" />
      < Label = "ipNetToMediaPhysAddress.3.192.168.0.202 " Value = " STRING: 0:9:5b:36:a:33" />
      < Label = "ipNetToMediaPhysAddress.9.10.7.219.56 " Value = " STRING: 0:19:d1:46:d3:16" />
   </ArgusEventData>
   <ArgusEventData Type = "SNMP Stats: /usr/bin/snmpget -Os -c xxxxxx -v 2c 192.168.0.1" >
      < Label = "ifInUcastPkts.2 " Value = " Counter32: 27505427" />
      < Label = "ifOutUcastPkts.2 " Value = " Counter32: 17997683" />
      < Label = "ifInOctets.2 " Value = " Counter32: 820077576" />
      < Label = "ifOutOctets.2 " Value = " Counter32: 3444316041" />
      < Label = "ifOutDiscards.2 " Value = " Counter32: 0" />
      < Label = "ifInUcastPkts.3 " Value = " Counter32: 20118859" />
      < Label = "ifOutUcastPkts.3 " Value = " Counter32: 23807470" />
      < Label = "ifInOctets.3 " Value = " Counter32: 3305162556" />
      < Label = "ifOutOctets.3 " Value = " Counter32: 3620106826" />
      < Label = "ifOutDiscards.3 " Value = " Counter32: 0" />
      < Label = "ifInUcastPkts.9 " Value = " Counter32: 848091" />
      < Label = "ifOutUcastPkts.9 " Value = " Counter32: 2618476" />
      < Label = "ifInOctets.9 " Value = " Counter32: 242767620" />
      < Label = "ifOutOctets.9 " Value = " Counter32: 2048906796" />
      < Label = "ifOutDiscards.9 " Value = " Counter32: 0" />
   </ArgusEventData>
</ArgusEvent>

Argus can now generate the data; it is up to us to write clients that parse the data and do good things
with it, such as graph it, correlate it, push it into mysql or rrd's, or just store it.
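
As a starting point for such a client, here is a minimal Python sketch that pulls the Label/Value pairs out of the event text shown above. It uses regular expressions rather than an XML parser because the `< Label = ... />` items have no element name, so a strict XML parser would reject them. The function name `parse_event` is hypothetical, and the patterns are fitted only to the sample output in this message.

```python
import re

# One <ArgusEventData Type = "..."> ... </ArgusEventData> section.
SECTION_RE = re.compile(
    r'<ArgusEventData\s+Type\s*=\s*"([^"]*)"\s*>(.*?)</ArgusEventData>',
    re.DOTALL,
)
# One < Label = "..." Value = "..." /> item inside a section.
PAIR_RE = re.compile(r'<\s*Label\s*=\s*"([^"]*)"\s+Value\s*=\s*"([^"]*)"\s*/>')


def parse_event(text):
    """Return {section type: {label: value}} with surrounding whitespace stripped."""
    sections = {}
    for section_type, body in SECTION_RE.findall(text):
        pairs = {}
        for label, value in PAIR_RE.findall(body):
            pairs[label.strip()] = value.strip()
        sections[section_type.strip()] = pairs
    return sections


# Two lines copied from the rasnmp output above.
SAMPLE = '''<ArgusEvent>
   <ArgusEventData Type = "SNMP Stats: /usr/bin/snmpget -Os -c xxxxxx -v 2c 192.168.0.1" >
      < Label = "ifInOctets.2 " Value = " Counter32: 820077576" />
      < Label = "ifOutOctets.2 " Value = " Counter32: 3444316041" />
   </ArgusEventData>
</ArgusEvent>'''

if __name__ == "__main__":
    for section, pairs in parse_event(SAMPLE).items():
        print(section)
        for label, value in pairs.items():
            print("  ", label, "=", value)
```

From the resulting dictionary it is a short step to feeding counters into a database or a graphing tool.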

There are a lot of candidate ArgusEvent generators, such as syslog() message collectors, temperature
sensors, multicore resource availability metric generators, etc.  I hope that we create a lot of them.

Hope all is most excellent, and that lots of sites find this new feature useful!!!!!
Comments always more than welcome.

Carter

pengiran | 9 Feb 16:22

how to filter arp, llc, loop, ospf.

Hi all,

I want to record traffic for a period of time. Currently I have 4 sensors and 1 database server. All the traffic is collected and inserted into the database by rasqlinsert.

I want to filter the traffic where the proto is arp, llc, loop, or ospf.

I know we can use "- ip proto not icmp " and "argus.out "not icmp" as filters. When I change the protocol to "ospf", argus runs smoothly and reading with ra doesn't show any ospf records. But when I change it to llc or loop, argus simply does not start (I checked /var/run and used "ps aux | grep argus").


please guide me.

Thanks

Regards,
Peng

Phillip G Deneault | 4 Feb 04:09

rafilteraddr issue

Hello all,

I'm attempting to use rafilteraddr and I must be using it wrong, but there 
isn't any authoritative documentation on it.  I'm using argus-clients-3.0.2 
from http://qosient.com/argus/dev/ from the tarball dated 1/26/10.

Right now I'm just attempting to take a file and filter it to get a smaller 
subset of records.  My source file has only a handful of records and 
contains my targeted IP.

I'm running:
rafilteraddr -f filtertest.txt -r /data/argusinput -w /data/argusoutput

with a file containing my one target address.  If I try this command with 
the one line '192.168.1.1' or '192.168.1.1/32', I get the records I 
expect.

If I try '192.168.1.0/24', I get none of the records I expect.

If I use -vf to invert my results, I get similar behavior.  Filters using 
the /24 are ignored, but entries with the /32 are processed correctly.

If I put more than one record in my filter list, mixing /24s and /32s, the 
/24 records are ignored and the /32s are processed correctly.

Could something be parsing the file wrong?  or am I doing something wrong?

Thanks,
Phil

pengiran Awang | 30 Jan 04:58

racluster error when using -m none

hello..

I want to turn off the aggregation function so that the data is stored without aggregation.

But when I use the "-m none" option, I get this error.

shell> racluster -r argus-id1 -m none
racluster[13247]: 11:47:16.739104 ArgusClientInit: ArgusNewAggregator error


When I use racluster without "-m none", I get the output on my screen.

Please advise.

regards,
peng

Carter Bullard | 28 Jan 18:28

Re: question regarding argus-client.

Hey Pengiran,
You will have to send email to the argus mailing list for me to respond
to them.  And try to keep your questions to one at a time.  I'm forwarding
this to the list, so that it gets in the archive.

Please send PDFs rather than Word files.  While I don't trust either of these
formats, if you need to send a diagram, PDFs are better for me.

If you have errors running a program, please send any error messages
along with the command line options, so I can figure out what the problem
may be.

For your situation where you want to populate a MySQL database table with
the primitive argus data from 4 remote argus sensors,  you will want 
to use radium() to collect the records from the 4 sensors, and a single
rasqlinsert() to read the combined stream of argus records and write them
to the database table you specify.  You will want to make sure that the "srcid"
field is in the list of print fields, so that rasqlinsert() will create a column with
the argus source id, so you can pick and choose the records you're interested
in.

radium() uses an /etc/radium.conf configuration file that you will create, using
the sample provided in ./support/Config/radium.conf.  Create 4
RADIUM_ARGUS_SERVER="" lines with the addresses of the
4 argus sources, and set the RADIUM_ACCESS_PORT to a number
you like. I use 561, but for this example let's call it XXXX.

This will collect the data from the 4 sensors, and give you a single point
to access all the data.  Use ra() as a test, to attach to your radium() and
see the traffic that it is collecting.

   ra -S localhost:XXXX -s +1srcid

The fields that ra() prints will be the fields that are used to define the database
schema.  You don't want a large number of fields, just the ones that will be useful
for you.  autoid, stime, srcid, saddr, daddr, proto, sport, dport, pkts, and bytes are
a good start, but you will want to modify that.  The binary record is also inserted into the
database, so all the other information is stored, but it's not "exposed" to MySQL.

You will run rasqlinsert() so that it attaches to radium().

   rasqlinsert -m none -S localhost:XXXX -s +0autoid +1srcid -M time 1d \

This will write records into a daily table that has the date in its name.
Using mysql() check the schema that rasqlinsert() created, and add fields using
the "-s " option as needed.  Be sure and drop any tables from the data that may
be affected, if you change the schema.

Using the mysql() program, print out the current schema for the table that you
are writing into.

% mysql -u user 
mysql> use argus
mysql> describe argusTable_2010_01_28
+--------+-----------------------+------+-----+---------+-------+
| Field  | Type                  | Null | Key | Default | Extra |
+--------+-----------------------+------+-----+---------+-------+
| ltime  | double(18,6) unsigned | NO   |     | NULL    |       | 
| dur    | double(18,6)          | NO   |     | NULL    |       | 
| srcid  | varchar(64)           | NO   | PRI |         |       | 
| saddr  | varchar(64)           | NO   | PRI | NULL    |       | 
| daddr  | varchar(64)           | NO   | PRI | NULL    |       | 
| bytes  | bigint(20)            | YES  |     | NULL    |       | 
| record | blob                  | YES  |     | NULL    |       | 
+--------+-----------------------+------+-----+---------+-------+
7 rows in set (0.03 sec)

The "record" field holds the argus record.
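
Once srcid is a column, picking out records per sensor is an ordinary query. The sketch below shows the idea in Python. To stay self-contained it uses the standard-library sqlite3 module as a stand-in for MySQL, a simplified copy of the schema above, and placeholder rows invented for the demonstration. The helper names `make_demo_db` and `bytes_per_sensor` are hypothetical.

```python
import sqlite3

TABLE = "argusTable_2010_01_28"


def make_demo_db():
    """Build an in-memory table shaped like the schema above, with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE %s (ltime REAL, dur REAL, srcid TEXT, saddr TEXT, "
        "daddr TEXT, bytes INTEGER, record BLOB)" % TABLE
    )
    rows = [  # invented values, only for illustration
        (1264700000.0, 1.5, "192.168.0.68", "192.168.0.3", "192.168.0.1", 100, b""),
        (1264700010.0, 0.5, "192.168.0.68", "192.168.0.66", "192.168.0.1", 200, b""),
        (1264700020.0, 2.0, "192.168.0.69", "192.168.0.164", "192.168.0.1", 50, b""),
    ]
    conn.executemany("INSERT INTO %s VALUES (?, ?, ?, ?, ?, ?, ?)" % TABLE, rows)
    return conn


def bytes_per_sensor(conn, table):
    """Sum the bytes column for each argus source id."""
    # Table names cannot be bound as parameters, so validate before formatting.
    if not table.replace("_", "").isalnum():
        raise ValueError("unexpected table name: %r" % table)
    query = "SELECT srcid, SUM(bytes) FROM %s GROUP BY srcid ORDER BY srcid" % table
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for srcid, total in bytes_per_sensor(make_demo_db(), TABLE):
        print(srcid, total)
```

Against a real MySQL server the same SELECT would be issued through a MySQL client library instead of sqlite3.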



Carter

On Jan 28, 2010, at 11:50 AM, pengiran Awang wrote:

Hai Carter,

Thank you for helping me. I managed to write the data into the MySQL database, but unfortunately I face a new challenge when trying to deal with multiple sensors.

With this email I attach a basic overview of what I am trying to achieve for my project.

Need your advice and guidance.

Regards,
Peng   

On Sat, Jan 23, 2010 at 4:31 AM, Carter Bullard <carter <at> qosient.com> wrote:
Hey Peng,
I forgot to mention that you should read the database page on the argus web site.


It may answer some of your questions.

Carter

On Jan 22, 2010, at 12:18 AM, pengiran Awang wrote:

Hai Carter,

I am a student at a local university in Malaysia and I have just started using argus.

I want to ask for your help and suggestions.

Currently I'm building 4 argus sensors and 1 database server (MySQL).

I read the mailing list
http://thread.gmane.org/gmane.network.argus/6953/focus=6964
and I managed to insert the data with rasqlinsert() using this command.

argus -r <tcpdump.out> -w - | rasqlinsert -r - -w mysql://user <at> host/db/argusTable

Will this command insert the new traffic recorded in the tcpdump file directly into the same table (argusTable)?


I'm trying to create only one table that will record all the traffic from the 4 sensors. Is there any possibility of inserting the data from multiple sensors into the same table (argusTable) without creating a new table in the database?

Can the "record" field be used to tell which sensor a record came from?

Sorry for asking you such basic questions. I have been going through the mailing list for 4 days now and I just get lost in it =(.

Regards,
peng




<Question_To_Carter.doc>

Carter Bullard
CEO/President
QoSient, LLC
150 E 57th Street Suite 12D
New York, New York  10022

+1 212 588-9133 Phone
+1 212 588-9134 Fax



Carter Bullard | 25 Jan 18:46

Re: argus-3.x request (forwarded)

Hey Martin,
You should be using argus-3.0!!!  Or at least be playing with it ;o)
Could you resend this response to the argus mailing list, or can I?

So some clarification on your 2 items.

  1.  Duplicate packets
           So this is an interesting problem because there is very little information in
           the packet to help you differentiate between duplicates and real retransmissions.
           They aren't always identical packets, since the L2 information or TTLs may
           differ when the mirror is not on the first hop link/interface.

           There are a few ideas.  The best would be to reject packets with different L2/tunnel
           id's but identical IP (ignoring the TTL) and transport identifiers (especially any sequence
           numbers) that arrive within less than one RTT for the flow.  The RTT can be determined,
           when possible, and we can come up with reasonable default values (< 100uSec).

           Currently, argus only has cached information from the last packet  seen in either
           direction.  So if the packet train is something like this:

               1, 1, 2, 2, 3, 3, 4, 4

          we can figure it out.  But if it's something like this:
               1, 2, 3, 1, 2, 3,  4, 5, 6, 4, 5, 6

          then a simple strategy is not going to do it for us.

          Programs like editcap() attempt to remove duplicates by keeping an MD5 cksum of the
          last 4 packets in a cache and rejecting matches.  This is doable, but it also is simple
          and would have some issues.

          I suspect we can do something like the two above.  Keep a hash with the timestamp
          on a per flow basis, look at the L2 info and keep the hashes in a queue for a while, to
          identify duplicates, and account for them in an additional counter.

          The trick will be to inspect a bunch of packet files that capture this situation and check
          to see how best to identify the duplicates.  I can start working on this problem now, if
          we have the files.

  2.  DNS transaction capture

          You can do this today with argus.  A user data buffer capture of 256 bytes will capture all
           the data needed to do what you want with DNS, and the program radump() will
           print out all the DNS information you need to do the tracking.  The problem is that
           with this strategy you capture 256 bytes of every flow, and that may be an issue for
           some sites.

          Item #3 of work items for 2010, mentions control plane flow monitoring.
          DNS is THE internet Call Control protocol.  (see slides 35 and 36 of the FloCon
          argus 1/2 day tutorial, Introduction to Argus).

           So, we'll have specific support for DNS tracking in argus-3.0.4.  What this
          really means is that we will capture all the payload data in the control plane flow,
          DNS included (also DHCP, ARP, STP, RIP, OSPF, ISIS, BGP, SIP, RSVP) , so you
          don't have to grab 256 bytes of every transaction to get the control plane flows to grab
          what you want.
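
For item 1 above, the "hashes in a queue, per flow, with a timestamp" idea can be sketched in Python as follows. This is an illustration under stated assumptions, not how argus implements or will implement it: the class and method names are hypothetical, SHA-256 is an arbitrary choice of hash, the 100 microsecond window follows the default suggested above, and the caller is assumed to pass in only the bytes that should match between copies (IP and transport identifiers with L2 and TTL already excluded).

```python
import hashlib
from collections import defaultdict, deque


class DuplicateDetector:
    """Flag packets whose invariant bytes repeat within a short window, per flow."""

    def __init__(self, window_usec=100):
        self.window_usec = window_usec
        self.recent = defaultdict(deque)     # flow key -> deque of (timestamp, digest)
        self.duplicates = defaultdict(int)   # flow key -> duplicate counter

    def is_duplicate(self, flow, timestamp_usec, invariant_bytes):
        digest = hashlib.sha256(invariant_bytes).digest()
        queue = self.recent[flow]
        # Expire digests that are older than the window.
        while queue and timestamp_usec - queue[0][0] > self.window_usec:
            queue.popleft()
        if any(seen == digest for _, seen in queue):
            self.duplicates[flow] += 1
            return True
        queue.append((timestamp_usec, digest))
        return False


if __name__ == "__main__":
    detector = DuplicateDetector(window_usec=100)
    # The harder packet train from the message: 1, 2, 3, 1, 2, 3, 4, 5, 6, 4, 5, 6
    train = [1, 2, 3, 1, 2, 3, 4, 5, 6, 4, 5, 6]
    for index, packet_id in enumerate(train):
        dup = detector.is_duplicate("flowA", index * 10, bytes([packet_id]))
        print(packet_id, "duplicate" if dup else "new")
    print("duplicates counted:", detector.duplicates["flowA"])
```

Because the queue holds every digest seen inside the window rather than just the last packet, the interleaved train is handled as well as the simple 1, 1, 2, 2 case. A true retransmission that arrives inside the window with identical bytes would still be counted as a duplicate, which is why comparing the L2/tunnel ids, as suggested above, remains useful.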

With regard to your perfect world, I agree, and the approach is that those jobs (correlation
between flow information elements) are the jobs of argus clients and information systems.
If argus is doing a good job, then the data is captured, but external programs are
needed to track this information.  I do this with DHCP data, but where the IP address user
mappings come from is usually an information system outside the observation domain of
argus.

The thing that we are going to add in argus-3.0.4 (this work is completed by the way) is to have
end system argus event generators provide information like, "this process owns this flow",
and "this user owns this process".  So that if you instrument your end systems, you get explicit
information.

So, what do you think?

Carter

On Jan 25, 2010, at 9:01 AM, Martin xxxxxx  wrote:


Subject: [ARGUS] flocon 2010 presentations on the web
From: Carter Bullard <carter <at> qosient.com>
To: Argus <argus-info <at> lists.andrew.cmu.edu>
Date: Fri, 22 Jan 2010 14:00:43 -0500

Gentle people,
I've updated the argus home page and I've put a list of what I was going
to do for version 3.0.4.  If you have any ideas, I'd love to include them!!!

Hi Carter!

Two things I've been missing in my argus data:

1.
You already have:
            s      -  Src TCP packet retransmissions
            d      -  Dst TCP packet retransmissions
            *      -  Both Src and Dst TCP retransmissions

I would like argus to distinguish between retransmissions and duplicate copies of a frame.

Why, you ask?
Well, because it is very common that customers set up faulty SPAN mirroring. So the sensor (i.e. argus) receives two identical copies of a frame.
(In HP procurve switches, it is even "common" to have one copy of packets in one direction but two copies in the other...)

The problem is how the switches deal with "in", "out", "both" mirroring and VLAN-mirroring (opposite to port mirroring).


Right now the unwanted extra copies register as "retransmissions" even though no TCP retransmission has occurred.

I would like Argus to be able to distinguish between the two scenarios so it doesn't give false retransmission statistics and to help me spot customers with a faulty SPAN setup.



2.
I would like argus to store all DNS requests and/or responses (configurable).
This way I would have a database of requested hostnames which can be used to:
* match lookups against a database of known bad hostnames/strings
* afterwards be able to figure out the actual hostname of a web server without the payload from the GET request header (the "Host:" line).


(I currently use Argus 2.x, so if any of the above is already invented, I'm sorry to have wasted your time with this email :-)   )


/Martin



PS.
In a perfect world, I would like argus to be able to keep state of the "identity" behind IPs. I.e. argus should know how to decode specific protocols and look for data that might identify this IP (apart from the current IP and Mac).
Example:
From Windows NetBIOS packets you can get the hostname and MAC address of an IP (get MAC even if the sensor is not located on the same segment as the client).
From NetBIOS/SMB packets you can get usernames, this is usually nice information to have when trying to determine "who did the p2p filetransfer" or whatever.
From DNS responses you can get hostnames for IPs.
From DHCP/bootp you can get hostnames for IPs.

Apart from the vast work of developing all the protocol decoding needed, you would also need a smart way to store changes in Identification, and even harder - methods to query this information based on time.



Carter Bullard | 25 Jan 18:44

argus 3.x request (forwarded)


Subject: [ARGUS] flocon 2010 presentations on the web
From: Carter Bullard <carter <at> qosient.com>
To: Argus <argus-info <at> lists.andrew.cmu.edu>
Date: Fri, 22 Jan 2010 14:00:43 -0500

Gentle people,
I've updated the argus home page and I've put a list of what I was going
to do for version 3.0.4.  If you have any ideas, I'd love to include them!!!

Hi Carter!

Two things I've been missing in my argus data:

1.
You already have:
             s      -  Src TCP packet retransmissions
             d      -  Dst TCP packet retransmissions
             *      -  Both Src and Dst TCP retransmissions

I would like argus to distinguish between retransmissions and duplicate copies of a frame.

Why, you ask?
Well, because it is very common that customers set up faulty SPAN mirroring. So the sensor (i.e. argus) receives two identical copies of a frame.
(In HP procurve switches, it is even "common" to have one copy of packets in one direction but two copies in the other...)

The problem is how the switches deal with "in", "out", "both" mirroring and VLAN-mirroring (opposite to port mirroring).


Right now the unwanted extra copies register as "retransmissions" even though no TCP retransmission has occurred.

I would like Argus to be able to distinguish between the two scenarios so it doesn't give false retransmission statistics and to help me spot customers with a faulty SPAN setup.



2.
I would like argus to store all DNS requests and/or responses (configurable).
This way I would have a database of requested hostnames which can be used to:
* match lookups against a database of known bad hostnames/strings
* afterwards be able to figure out the actual hostname of a web server without the payload from the GET request header (the "Host:" line).


(I currently use Argus 2.x, so if any of the above is already invented, I'm sorry to have wasted your time with this email :-)   )


/Martin



PS.
In a perfect world, I would like argus to be able to keep state of the "identity" behind IPs. I.e. argus should know how to decode specific protocols and look for data that might identify this IP (apart from the current IP and Mac).
Example:
From Windows NetBIOS packets you can get the hostname and MAC address of an IP (get MAC even if the sensor is not located on the same segment as the client).
From NetBIOS/SMB packets you can get usernames, this is usually nice information to have when trying to determine "who did the p2p filetransfer" or whatever.
From DNS responses you can get hostnames for IPs.
From DHCP/bootp you can get hostnames for IPs.

Apart from the vast work of developing all the protocol decoding needed, you would also need a smart way to store changes in Identification, and even harder - methods to query this information based on time.
Carter Bullard | 22 Jan 20:00

flocon 2010 presentations on the web

Gentle people,
I've updated the argus home page and I've put a list of what I was going
to do for version 3.0.4.  If you have any ideas, I'd love to include them!!!

I also put a blurb about FloCon 2010 and there are links to the FloCon 2010
argus presentations.   The "Introduction to Argus" is 100 slides that talk
about a lot of stuff.  I'm hoping that it could be the start of something like
an O'Reilly Nutshell book on Argus.  

The "Data Fusion" presentation, is a description of a new concept in flow
based Situational Awareness, where we use differential correlation from
multiple probes in the same time domain to solve basic attribution and location
determination problems.  

Well, the slides don't use such complex words, but the above message is
buried in some discussion on Geo and NetSpatial information and flow data.

Please take a look, and if you have any opinions, I'd love to hear them.

Hope all is most excellent,

Carter

Jason Carr | 18 Jan 20:29

10 minute time difference

I am running radium and writing its output to a file, /data/var/argus.out.  Every five minutes I
copy this file into a directory and rename the file to the current date and time.  When I read this file
with ra, the data appears to be 10 minutes in the past.  For example, a file that is named
2010-01-18-10:25:00.argus contains data from 10:15-10:20.  Is there a reason why this would occur?  The
clocks on the data source and the destination are both synced with ntp.

Thanks,

Jason

Sean McCreary | 15 Jan 03:37

Two versions of argus-clients-3.0.2.tar.gz

<ftp://qosient.com/pub/argus/src/argus-clients-3.0.2.tar.gz> is not the
same as <http://qosient.com/argus/src/argus-clients-3.0.2.tar.gz>.
Which is the correct file?

Niall Murphy | 14 Jan 11:54

ArgusReadSocketStream error

Hello all,

First of all, apologies for mailing a support-related question to the development list, but I have searched
for the following error message, which appears at the end of my Argus log, and cannot find references to it
anywhere, even searching koders.com.

"ArgusReadSocketStream: malformed argus record len 0"

I'm running...

ii  argus-client                      2.0.6.fixes.1-3          IP network transaction auditing tool
ii  argus-server                      1:2.0.6.fixes.1-16       IP network transaction auditing tool

...on debian stable.

If anyone could email me to let me know the possible triggers, impact, and workaround for this it would be
greatly appreciated.

If you need any more information please let me know.

Thanks.

-- 
Niall

