Martijn van Oosterhout | 4 Mar 2010 17:34

Problem with byte-swapped IP addresses

Hi,

(argus 3.0.0, but it also happens with 3.0.3.2)

I'm having a problem with IP addresses being byte-swapped in the argus
output, like so:

03 Mar 10 00:30:16  e    f    tcp      70.20.168.192 *         ->
93.20.168.192 *             1       1514   INT
03 Mar 10 00:30:16  e    f    tcp      93.20.168.192 *         ->
192.168.20.70 *             1       1514   INT
03 Mar 10 00:30:17  e         tcp      70.20.168.192.1823      ?>
192.168.20.93.1307          1       1514   CON
03 Mar 10 00:30:21  e    f    tcp      12.20.168.192 *         ->
62.20.168.192 *             1       1514   INT

All the addresses on this network are 192.168.x.x, so the given
addresses are not possible. Other weird things:

- Argus often shows the fragment flag on, yet raw packet captures for
the same period show no fragments at all.
- When it happens, it is most commonly both source and dest, but
sometimes just one. In that case the source is much more commonly
byte-swapped.
- I have confirmed that they are byte-swapped in the argus data files,
so it's not a problem with ra. It's done wrong by the server.
- These byte-swapped addresses happen sporadically in the stream where
they occur:

03 Mar 10 00:30:07  M d       tcp      192.168.20.93.1307      ->

Peter Van Epp | 4 Mar 2010 19:59

Re: Problem with byte-swapped IP addresses

On Thu, Mar 04, 2010 at 05:34:13PM +0100, Martijn van Oosterhout wrote:
> Hi,
> 
> (argus 3.0.0, but it also happens with 3.0.3.2)
> 
> I'm having a problem with IP addresses being byte-swapped in the argus
> output, like so:
> 
> 03 Mar 10 00:30:16  e    f    tcp      70.20.168.192 *         ->
> 93.20.168.192 *             1       1514   INT
> 03 Mar 10 00:30:16  e    f    tcp      93.20.168.192 *         ->
> 192.168.20.70 *             1       1514   INT
> 03 Mar 10 00:30:17  e         tcp      70.20.168.192.1823      ?>
> 192.168.20.93.1307          1       1514   CON
> 03 Mar 10 00:30:21  e    f    tcp      12.20.168.192 *         ->
> 62.20.168.192 *             1       1514   INT
> 
<snip>

	I assume this is an Intel (or other little-endian) machine? If so, I'd
look at the hton macros as a possible source (although I don't immediately
see why they would change). If the data isn't changed from network to host
order, I think it will be reversed this way (but I haven't actually looked at
the code). Given it seems to be at high load, there may be a bug (such as lack
of hton macros) in the overload code (argus reduces what it is capturing when
load gets too high). It may be profitable to try and capture the pcap input
files that argus sees by setting ARGUS_PACKET_CAPTURE_FILE in your argus.conf
file, although if the pcaps look OK it's more likely an argus bug somewhere, I
think. If you shouldn't be seeing any 70. addresses, a print statement coded
into argus that dumps the PCAP record would shed some light on matters.
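
To make the reversal concrete, here is a minimal Python sketch (not argus code) of what you get when the four bytes of an address end up in little-endian layout but are printed as if they were still in network order. The helper name swap32_dotted is made up for illustration.

```python
import socket
import struct


def swap32_dotted(addr: str) -> str:
    """Return the dotted quad seen when the 4 address bytes are reversed."""
    packed = socket.inet_aton(addr)          # 4 bytes in network (big-endian) order
    (value,) = struct.unpack("!I", packed)   # the correct 32-bit value
    swapped = struct.pack("<I", value)       # same value in little-endian layout
    return socket.inet_ntoa(swapped)         # printed as if it were network order


print(swap32_dotted("192.168.20.70"))  # 70.20.168.192
```

Feeding in 192.168.20.93 gives 93.20.168.192 the same way, which matches the addresses in the output quoted above.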

Martijn van Oosterhout | 5 Mar 2010 09:31

Re: Problem with byte-swapped IP addresses

On Thu, Mar 4, 2010 at 7:59 PM, Peter Van Epp <vanepp <at> sfu.ca> wrote:
> On Thu, Mar 04, 2010 at 05:34:13PM +0100, Martijn van Oosterhout wrote:
>> Hi,
>>
>> (argus 3.0.0, but it also happens with 3.0.3.2)
>>
>> I'm having a problem with IP addresses being byte-swapped in the argus
>> output, like so:
>
>        I assume this is an Intel (or other little-endian) machine?

Yes, it's Intel.

> It may be profitable to try and capture the pcap input
> files that argus sees by setting ARGUS_PACKET_CAPTURE_FILE in your argus.conf
> file, although if the pcaps look OK it's more likely an argus bug somewhere, I
> think.

Thanks! I didn't know argus had this feature, and it certainly
narrowed down the problem: the pcap file generated by argus
also has these byte-swapped packets.

# tcpdump -r ~/argus.dump host 98.20.168.192 -s 200 -n -e -v -XX
09:04:08.443801 00:06:5b:f4:fb:c7 > 00:06:5b:ed:4d:80, ethertype IPv4
(0x0800), length 1514: truncated-ip - 54825 bytes missing! (tos 0x0,
ttl 128, id 10994, offset 512, flags [none], proto TCP (6), length
56325, bad cksum 592c (->6e17)!) 18.20.168.192 > 98.20.168.192: tcp
        0x0000:  0006 5bed 4d80 0006 5bf4 fbc7 0800 4500  ..[.M...[.....E.
        0x0010:  dc05 2af2 0040 8006 592c 1214 a8c0 6214  ..*..@..Y,....b.
        0x0020:  a8c0 1347 0ca2 06b9 50d2 f1b5 3459 5010  ...G....P...4YP.
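
The header fields can be pulled straight out of the hex dump above. The sketch below is only an illustration in Python: it takes the 20 IP header bytes that follow the 14-byte Ethernet header, and the choice of which fields to un-swap (total length, id, fragment offset, source, destination) is an assumption made to fit this one packet, not something taken from the argus source. The helper names are made up.

```python
import socket
import struct

# 20 IP header bytes from the dump above, starting right after the
# 14-byte Ethernet header (i.e. at the "4500").
CAPTURED = bytes.fromhex("4500 dc05 2af2 0040 8006 592c 1214 a8c0 6214 a8c0")

# (offset, size) of the fields ASSUMED to have been converted in place:
# total length, id, flags/fragment offset, source and destination address.
SWAPPED_FIELDS = ((2, 2), (4, 2), (6, 2), (12, 4), (16, 4))


def undo_swap(header: bytes) -> bytes:
    """Reverse the bytes of each assumed-swapped field."""
    fixed = bytearray(header)
    for start, size in SWAPPED_FIELDS:
        fixed[start:start + size] = fixed[start:start + size][::-1]
    return bytes(fixed)


def checksum_ok(header: bytes) -> bool:
    """IPv4 header check: the folded one's-complement sum must be 0xFFFF."""
    total = sum(struct.unpack("!10H", header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


def describe(header: bytes) -> dict:
    """Decode the header fields in network byte order, as tcpdump does."""
    _, _, length, ident, frag, _, _, _, src, dst = struct.unpack(
        "!BBHHHBBH4s4s", header
    )
    return {
        "length": length,
        "id": ident,
        "df": bool(frag & 0x4000),          # "don't fragment" flag
        "offset": (frag & 0x1FFF) * 8,      # fragment offset in bytes
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
        "checksum_ok": checksum_ok(header),
    }


print(describe(CAPTURED))
print(describe(undo_swap(CAPTURED)))
```

With those five fields reversed, the length becomes 1500 (which fits the 1514-byte frame), the fragment offset of 512 turns back into a plain DF flag, the addresses become 192.168.20.18 and 192.168.20.98, and the stored checksum 592c verifies. The 54825 "bytes missing" is just 56325 - 1500.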

Carter Bullard | 5 Mar 2010 17:25

Re: Problem with byte-swapped IP addresses

Hey Martijn,
Sorry for the delayed response and sorry for the problems.
There have been only a few reports of this behavior, and it has
been very transient, and very difficult to track down, so I'm very
happy to have this information.  

If there is anything I can do to help, don't hesitate to holler!!!!

Carter


Peter Van Epp | 6 Mar 2010 04:19

Re: Problem with byte-swapped IP addresses

On Fri, Mar 05, 2010 at 09:31:57AM +0100, Martijn van Oosterhout wrote:
> On Thu, Mar 4, 2010 at 7:59 PM, Peter Van Epp <vanepp <at> sfu.ca> wrote:
> > On Thu, Mar 04, 2010 at 05:34:13PM +0100, Martijn van Oosterhout wrote:
> >> Hi,
> >>
> >> (argus 3.0.0, but it also happens with 3.0.3.2)
> >>
> >> I'm having a problem with IP addresses being byte-swapped in the argus
> >> output, like so:
> >
> >        I assume this is an Intel (or other little-endian) machine?
> 
> Yes, it's Intel.
> 
> > It may be profitable to try and capture the pcap input
> > files that argus sees by setting ARGUS_PACKET_CAPTURE_FILE in your argus.conf
> > file, although if the pcaps look OK it's more likely an argus bug somewhere, I
> > think.
> 
> Thanks! I didn't know argus had this feature, and it certainly
> narrowed down the problem: the pcap file generated by argus
> also has these byte-swapped packets.
> 
> # tcpdump -r ~/argus.dump host 98.20.168.192 -s 200 -n -e -v -XX
> 09:04:08.443801 00:06:5b:f4:fb:c7 > 00:06:5b:ed:4d:80, ethertype IPv4
> (0x0800), length 1514: truncated-ip - 54825 bytes missing! (tos 0x0,
> ttl 128, id 10994, offset 512, flags [none], proto TCP (6), length
> 56325, bad cksum 592c (->6e17)!) 18.20.168.192 > 98.20.168.192: tcp
>         0x0000:  0006 5bed 4d80 0006 5bf4 fbc7 0800 4500  ..[.M...[.....E.
>         0x0010:  dc05 2af2 0040 8006 592c 1214 a8c0 6214  ..*..@..Y,....b.

Martijn van Oosterhout | 7 Mar 2010 13:43

Re: Problem with byte-swapped IP addresses

On Sat, Mar 6, 2010 at 4:19 AM, Peter Van Epp <vanepp <at> sfu.ca> wrote:
>        While it would be nice to blame pcap (and pf-ring has its own version
> of pcap), I think it's unlikely, as pcap doesn't care about line order; it just
> copies the buffer it got from the interface. Argus is the one that cares about
> host order and thus does the hton-type macros (or in this case perhaps doesn't
> :-). That said, I would have thought the argus file output should just be a
> copy of the input buffer before any of the conversions are done. Can you get
> an independent capture (tcpdump, a sniffer, something like that) in parallel
> on the argus input interface so you could see what the wire thinks against
> what argus writes to the file?  If the dump from the wire is correct and the
> output from argus is wrong, that narrows the search path considerably, as the
> switch must be happening early in the path. As noted, I still think this is
> most likely an argus bug of some kind, although it's getting odder all the
> time :-).

I think I have a parallel capture from tcpdump, so tomorrow I'll try to
see if I can put them next to each other.

I agree that I don't see how the pcap library could be
responsible, since it plainly doesn't care about byte order, but in
the meantime I have confirmed that if you disable the ring buffer
(PCAP_FRAMES=0) the problem goes away (at least over the period we
tested; at the very least it hasn't happened yet).

One thought I had: does argus modify the data it receives from
libpcap? I don't think so, but I thought I'd check. Using the
ring buffer means that the packet data from libpcap is written directly
by the kernel, whereas normally it's passed back via read(). In that case
perhaps there's a race condition where argus is modifying the original
data (byte-swapping) after the kernel has written a new packet there.

Carter Bullard | 7 Mar 2010 17:51

Re: Problem with byte-swapped IP addresses

Argus does modify the packet, but only after it has called pcap_dump(),
so if the dump file contains the issue, I wouldn't think it's argus's fault.
Unless, of course, pcap_dump() is somehow just scheduling the buffer
to be written rather than copying it, and dumps the buffer after argus
modifies it.

We do reference some variables frequently, so doing ntoh[ls]
in place saves us some cycles.

Do you notice that the flows that are messed up are of a particular
flow type?  arp, esp?

Carter


Peter Van Epp | 8 Mar 2010 04:29

Re: Problem with byte-swapped IP addresses

On Sun, Mar 07, 2010 at 01:43:44PM +0100, Martijn van Oosterhout wrote:
> On Sat, Mar 6, 2010 at 4:19 AM, Peter Van Epp <vanepp <at> sfu.ca> wrote:
> >        While it would be nice to blame pcap (and pf-ring has its own version
> > of pcap), I think it's unlikely, as pcap doesn't care about line order; it just
> > copies the buffer it got from the interface. Argus is the one that cares about
> > host order and thus does the hton-type macros (or in this case perhaps doesn't
> > :-). That said, I would have thought the argus file output should just be a
> > copy of the input buffer before any of the conversions are done. Can you get
> > an independent capture (tcpdump, a sniffer, something like that) in parallel
> > on the argus input interface so you could see what the wire thinks against
> > what argus writes to the file?  If the dump from the wire is correct and the
> > output from argus is wrong, that narrows the search path considerably, as the
> > switch must be happening early in the path. As noted, I still think this is
> > most likely an argus bug of some kind, although it's getting odder all the
> > time :-).
> 
> I think I have a parallel capture from tcpdump, so tomorrow I'll try to
> see if I can put them next to each other.
> 
> I agree that I don't see how the pcap library could be
> responsible, since it plainly doesn't care about byte order, but in
> the meantime I have confirmed that if you disable the ring buffer
> (PCAP_FRAMES=0) the problem goes away (at least over the period we
> tested; at the very least it hasn't happened yet).
> 
> One thought I had: does argus modify the data it receives from
> libpcap? I don't think so, but I thought I'd check. Using the
> ring buffer means that the packet data from libpcap is written directly
> by the kernel, whereas normally it's passed back via read(). In that case
> perhaps there's a race condition where argus is modifying the original

Carter Bullard | 8 Mar 2010 15:44

Re: Problem with byte-swapped IP addresses

The weirdest scenario that I can think of is that under high load, the pf-ring
could be passing up the same packet twice.  We modify it on the first pass, and
so we get a modified packet as a new packet the second time.

Is this possible?
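
A rough way to picture that scenario, assuming a little-endian host and an in-place ntohl() on the source address only. The function name and the 20-byte stand-in buffer are made up for illustration; this is not how argus or pf-ring are actually structured.

```python
import socket

SRC = slice(12, 16)  # source address bytes within an IPv4 header


def convert_in_place(buf: bytearray) -> str:
    """Mimic ntohl() done in place on a little-endian host.

    The four bytes are reversed inside the buffer, then read back as a
    native (little-endian) integer, which is the address the flow code sees.
    """
    buf[SRC] = buf[SRC][::-1]
    value = int.from_bytes(buf[SRC], "little")
    return socket.inet_ntoa(value.to_bytes(4, "big"))


# One ring slot holding a packet from 192.168.20.70, in network order.
ring_slot = bytearray(20)
ring_slot[SRC] = socket.inet_aton("192.168.20.70")

first = convert_in_place(ring_slot)               # normal first delivery
dumped = socket.inet_ntoa(bytes(ring_slot[SRC]))  # raw bytes now left in the slot
second = convert_in_place(ring_slot)              # same slot handed up again

print(first, dumped, second)
```

The first pass sees 192.168.20.70, but the bytes left in the slot now read as 70.20.168.192 to anything that treats them as network order, and a second pass over the same slot hands the swapped address to the flow code as well.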

Carter


Martijn van Oosterhout | 8 Mar 2010 16:03

Re: Problem with byte-swapped IP addresses

On Mon, Mar 8, 2010 at 4:29 AM, Peter Van Epp <vanepp <at> sfu.ca> wrote:
>        I suspect we will find this is high-load related, in which case turning
> off pf-ring likely causes enough packet loss that the load doesn't tickle
> the bug :-). Have you given pf-ring lots of memory for buffers? As I recall
> (it's been a number of years since I last played with pf-ring), we had to boost
> the kernel VM space limit to be able to get a couple of megs of buffer space
> for pf-ring, although that may have been a 2.4 series kernel rather than 2.6.

That's an idea. Unfortunately I don't see a simple way to determine if
argus is dropping many packets or not. We've configured argus to have
a ring-buffer of 16384 packets. You can make it bigger, but if argus
isn't keeping up then it doesn't really matter how big you make it.
argus isn't using 100% CPU, but maybe that's a lie.

In any case, I have a comparison between tcpdump and argus. Here is
the output that argus dumped:

14:22:43.610709 IP 192.168.20.2.139 > 192.168.7.168.1614: .
2854318090:2854319550(1460) ack 1379121316 win 64938
14:22:43.610744 IP 192.168.15.23.1126 > 192.168.20.121.139: P
2087784273:2087784391(118) ack 2513133511 win 65391
14:22:43.610762 IP 192.168.7.168.1614 > 192.168.20.2.139: . ack
4294954156 win 17520
14:22:43.610790 IP 192.168.20.2.139 > 192.168.7.168.1614: .
1460:2920(1460) ack 1 win 64938
14:22:43.702911 00:50:56:93:2a:15 Null > 00:0f:1f:66:11:bc Unknown
DSAP 0x44 Information, send seq 85, rcv seq 0, Flags [Command], length
170
14:22:43.702936 IP truncated-ip - 54825 bytes missing! 43.20.168.192 >
46.20.168.192: tcp