Re: best way to process data from multiple hosts?
Carter Bullard <carter <at> qosient.com>
2008-12-08 15:38:44 GMT
Hey Ken,
Try using these two programs, rasplit() and/or rastream().
Both programs split an incoming argus stream, and you can
set it up to split based on the argus data source id, which should
give you your separation.
Focusing on rasplit(), this is what I do on all my collection sinks.
rasplit -S radii:561 -M time 5m -w /path/to/archive/\$srcid/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
This will take in a generic stream, and split the data into
an "argus source id" rooted, time based file structure, where
each file represents the data in a given 5 minute time span.
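
To make the layout concrete, here is a rough Python sketch of how a record's timestamp and source id would map onto a file name under that -w template. This is only an illustration of the naming scheme, not how rasplit() is implemented. The function name archive_path, the example source id, and the use of UTC are assumptions made for the example.

```python
from datetime import datetime, timezone

BUCKET_SECONDS = 5 * 60  # mirrors "-M time 5m"
ARCHIVE_ROOT = "/path/to/archive"  # same placeholder root as the command above


def archive_path(srcid: str, epoch_seconds: float) -> str:
    """Return the file a record starting at epoch_seconds would land in.

    Hypothetical helper: it rounds the timestamp down to the start of its
    5 minute bucket and fills in the $srcid/%Y/%m/%d/argus.* pattern.
    UTC is assumed here only to keep the example deterministic.
    """
    bucket_start = int(epoch_seconds) // BUCKET_SECONDS * BUCKET_SECONDS
    stamp = datetime.fromtimestamp(bucket_start, tz=timezone.utc)
    dated = stamp.strftime("%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S")
    return f"{ARCHIVE_ROOT}/{srcid}/{dated}"


if __name__ == "__main__":
    when = datetime(2008, 12, 8, 15, 38, 44, tzinfo=timezone.utc).timestamp()
    print(archive_path("10.0.0.1", when))
```

Every record from a given source whose start time falls in the same 5 minute window ends up with the same path, which is why each file covers exactly one time span.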
As time goes on, rasplit() creates new files, so your archive
grows as needed.
Rasplit() will break records across time boundaries, so that
the stats are preserved correctly when graphing, processing,
analyzing, whatever.
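
As a sketch of what "breaking a record across time boundaries" means, the Python below cuts one record's time range into pieces that each sit inside a single 5 minute bin. It only computes the time slices. How rasplit() divides the packet and byte counters between the pieces is not covered here, and split_span is a made-up name for the example.

```python
from typing import List, Tuple


def split_span(start: float, end: float, bucket: int = 300) -> List[Tuple[float, float]]:
    """Cut the interval [start, end) at every multiple of `bucket` seconds.

    Hypothetical helper, not rasplit() code: each returned (begin, finish)
    pair lies entirely inside one time bin.
    """
    if end <= start:
        # Zero-length record: it belongs to a single bin as-is.
        return [(start, end)]
    pieces = []
    cursor = start
    while cursor < end:
        # Next bin boundary strictly after the cursor.
        boundary = (int(cursor // bucket) + 1) * bucket
        piece_end = min(boundary, end)
        pieces.append((cursor, piece_end))
        cursor = piece_end
    return pieces


if __name__ == "__main__":
    # A record running from second 290 to second 610 touches three bins.
    print(split_span(290, 610))
```

A record that starts and ends inside the same bin comes back as a single piece, so only the records that straddle a boundary get split.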
Because rasplit() can connect to up to, what is it, 64 remote sources,
and because they can all be argi, or radii, or a mix of the two, you
don't have to worry about the collection tree structure so much.
So I recommend, at first, one radium() to connect to all your sources,
and one rasplit() to connect to your radium(). Where the radium
resides is not important, as long as you have resources. But having
radium() and rasplit() on the same machine has its advantages, in