starlight | 1 Dec 2007 23:43

[Openswan dev] bug report for 2.6.10-1 on OpenWRT Kamikaze 7.09: erratic ping, terrible performance

The bug tracking system wouldn't let me create an account,
so I gave up.

Running on OpenWRT on WRT54G v2.0 and WRT54GS v2.1

Just upgraded from White Russian 0.9 to Kamikaze 7.09.

Was running White Russian packaged OpenSWAN 2.4.8rc1, upgraded 
to OpenSWAN 2.4.10-1 on Kamikaze.

With the same exact 'ipsec.conf', ICMP echo times have suddenly
become erratic.  On the old version 'ping' gave solid and consistent
values in the range of 14-17 milliseconds.  Now it looks like a
staircase running from 15 to 100 milliseconds.

Also had installed OpenSWAN direct release 2.4.9-7 on White 
Russian router just prior to building the Kamikaze router.  Saw 
same problem.

Also getting terrible performance from a 'curl' HTTP file 
upload.  So bad I may go back to the original configuration. 
Maybe will try Kamikaze provided OpenSWAN build.

A sample 'ping' output (run from a CentOS 4.5 Athlon 4800+) 
follows.

$ ping vxp08
PING vxp08 (204.13.75.38) 56(84) bytes of data.
64 bytes from vxp08 (204.13.75.38): icmp_seq=0 ttl=125 time=75.9 ms
64 bytes from vxp08 (204.13.75.38): icmp_seq=1 ttl=125 time=42.2 ms

starlight | 2 Dec 2007 01:55

[Openswan dev] WITHDRAWN: bug report for 2.6.10-1 on OpenWRT Kamikaze 7.09: erratic ping, terrible performance

Several more experiments, including pings to the VPN gateway
from a bare-metal router with no OpenSWAN, indicate that it's not
OpenSWAN.  Looks like the network may have changed by
coincidence at the time of the original upgrade.  Seems plausible,
as Comcast had a network outage on the same path a couple of
days back and problems have continued since.

Sorry about the spurious bug report.

Decided to stick with running 2.4.10-1 for a while.

starlight | 2 Dec 2007 21:23

[Openswan dev] OpenSwan 2.6.10-1 on OpenWrt 7.09 consistently hangs on large HTTP file transfer

After approximately 6GB of an HTTP file transfer through OpenSWAN,
it locks up suddenly and completely.  All other OpenWRT
router functions continue to work normally.

Need some help on how to collect details for diagnosing the 
problem.  Everything looks normal, no error messages of any kind 
in the 'syslogd' log captured on a Linux server.  'dmesg' output 
from router is equally devoid of any diagnostic messages.
'ipsec whack --status' output (attached) looks fine to me.

Rebooting the router brings it back immediately.

Had this same issue with OpenWRT 0.9 running OpenSWAN
2.4.8, except the router would crash/reboot.

version 2.0

config setup
        interfaces=%defaultroute
        nat_traversal=no
        klipsdebug=none
        plutodebug=none

conn SouthEdge
        left=             %defaultroute
        leftnexthop=      %defaultroute
        right=            10.13.73.228
        rightsubnet=      10.13.75.38/32
        authby=           secret
        auto=             start

Paul Wouters | 2 Dec 2007 22:13

Re: [Openswan dev] OpenSwan 2.6.10-1 on OpenWrt 7.09 consistently hangs on large HTTP file transfer

On Sun, 2 Dec 2007, starlight <at> binnacle.cx wrote:

> After approximately 6GB of an HTTP file transfer through OpenSWAN,
> it locks up suddenly and completely.  All remaining OpenWRT
> router functions continue to function normally.
>
> Need some help on how to collect details for diagnosing the
> problem.  Everything looks normal, no error messages of any kind
> in the 'syslogd' log captured on a Linux server.  'dmesg' output
> from router is equally devoid of any diagnostic messages.
> 'ipsec whack --status' output (attached) looks fine to me.

Check if the machine is running out of memory?

Perhaps log the /proc/slabinfo contents every minute to
see if there is a memory issue.
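
A minimal sketch of the simplest form of that check, assuming a
/proc/meminfo that has a "MemFree:" line. The helper name mem_sample
is made up for illustration; it prints one "epoch-seconds MemFree-kB"
pair per call, which is easy to eyeball or plot for a downward trend:

```shell
#!/bin/sh
# Print one "epoch-seconds MemFree-kB" sample.
# mem_sample is a hypothetical helper; the file defaults to /proc/meminfo
# and can be overridden with a saved copy for offline use.
mem_sample() {
    awk -v now="$(date +%s)" '/^MemFree:/ { print now, $2 }' "${1:-/proc/meminfo}"
}
```

Something like "while true; do mem_sample; sleep 60; done" would give
one sample a minute.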

Paul

starlight | 2 Dec 2007 22:20

Re: [Openswan dev] OpenSwan 2.6.10-1 on OpenWrt 7.09 consistently hangs on large HTTP file transfer

Ok.  Running that now.  Should break in about a day.

At 04:13 PM 12/2/2007 -0500, Paul Wouters wrote:
>
>Check if the machine is running out of memory?
>
>Perhaps log the /proc/slabinfo contents every minute to
>see if there is a memory issue.

Michael Richardson | 2 Dec 2007 22:39

Re: [Openswan dev] OpenSwan 2.6.10-1 on OpenWrt 7.09 consistently hangs on large HTTP file transfer


>>>>> "starlight" == starlight  <starlight <at> binnacle.cx> writes:
    starlight> After approximately 6GB of an HTTP file transfer through OpenSWAN, 
    starlight> it locks up suddenly and completely.  All remaining OpenWRT 
    starlight> router functions continue to function normally.

    starlight> Need some help on how to collect details for diagnosing the 
    starlight> problem.  Everything looks normal, no error messages of any kind 
    starlight> in the 'syslogd' log captured on a Linux server.  'dmesg' output 
    starlight> from router is equally devoid of any diagnostic messages.
    starlight> 'ipsec whack --status' output (attached) looks fine to
    starlight> me.

  Also look at "ipsec spi" output.
  If you are transferring 6GB, you should have multiple rekeys.   It's
possible that you hit a bug with IPsec SAs going stale due to too much
data transferred.

-- 
]            Bear: "Me, I'm just the shape of a bear."          |  firewalls  [
]   Michael Richardson,    Xelerance Corporation, Ottawa, ON    |net architect[
] mcr <at> xelerance.com      http://www.sandelman.ottawa.on.ca/mcr/ |device driver[
] panic("Just another Debian GNU/Linux using, kernel hacking, security guy"); [

starlight | 3 Dec 2007 01:47

Re: [Openswan dev] OpenSwan 2.6.10-1 on OpenWrt 7.09 consistently hangs on large HTTP file transfer

At 04:39 PM 12/2/2007 -0500, Michael Richardson wrote:
>  Also look at "ipsec spi" output.

Ok, here's the logging line I'm running on the
router:

   while true; do
     echo; echo; date; echo
     ipsec spi; echo
     cat /proc/slabinfo; echo
     cat /proc/meminfo
     sleep 60
   done

The output is written to a file on a regular Linux system via

   ssh -l root router 2>&1 | tee router_log

When it blows up I'll extract the interesting
portion of the log and post it.
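
A minimal sketch for pulling the interesting portion out of
router_log afterwards, assuming the log holds only the date,
/proc/slabinfo and /proc/meminfo output. The helper name slab_growth
is made up for illustration; it compares active_objs (the second
slabinfo column) between the first and last snapshot and lists the
caches that grew:

```shell
#!/bin/sh
# List slab caches whose active object count grew between the first and
# last snapshot in a log file. slab_growth is a hypothetical helper name.
# Output format: "cache_name first_count last_count", sorted by name.
slab_growth() {
    awk '
        # Slab rows look like "name active_objs num_objs ..."; meminfo rows
        # have a name ending in ":" and are skipped, as are date lines.
        $1 !~ /:$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            if (!($1 in first)) first[$1] = $2 + 0   # value in first snapshot
            last[$1] = $2 + 0                        # value in latest snapshot
        }
        END {
            for (name in last)
                if (last[name] > first[name])
                    print name, first[name], last[name]
        }
    ' "$1" | sort
}
```

Run as "slab_growth router_log". A cache that keeps climbing across
snapshots is a candidate for the leak; this only flags growth and says
nothing about the cause.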

Michael Richardson | 3 Dec 2007 01:51

Re: [Openswan dev] OpenSwan 2.6.10-1 on OpenWrt 7.09 consistently hangs on large HTTP file transfer


>>>>> "starlight" == starlight  <starlight <at> binnacle.cx> writes:

    starlight> while true; do
    starlight> echo; echo; date; echo
    starlight> ipsec spi; echo
    starlight> cat /proc/slabinfo; echo
    starlight> cat /proc/meminfo
    starlight> sleep 60

  Pretty good.
  You may find http://www.sandelman.ca/software/ slabwatch-1.3.tgz useful.

--

-- 
]            Bear: "Me, I'm just the shape of a bear."          |  firewalls  [
]   Michael Richardson,    Xelerance Corporation, Ottawa, ON    |net architect[
] mcr <at> xelerance.com      http://www.sandelman.ottawa.on.ca/mcr/ |device driver[
] panic("Just another Debian GNU/Linux using, kernel hacking, security guy"); [
starlight | 3 Dec 2007 07:23
Favicon

Re: [Openswan dev] OpenSwan 2.6.10-1 on OpenWrt 7.09 consistently hangs on large HTTP file transfer

At 04:39 PM 12/2/2007 -0500, Michael Richardson wrote:
>  Also look at "ipsec spi" output.

Adding this command to the 60-second loop causes
OpenSwan and the transfer to hose up in a matter
of an hour or two, so I have removed it.  Will run
the command after the lock-up that results from
7GB of data transfer.

Tried stop/starting OpenSwan, but it did not restore
everything, so had to reboot the router.  Seems some
kernel corruption happens in this case.

starlight | 3 Dec 2007 16:27

Re: [Openswan dev] OpenSwan 2.6.10-1 on OpenWrt 7.09 consistently hangs on large HTTP file transfer

At 01:23 AM 12/3/2007 -0500, you wrote:
>At 04:39 PM 12/2/2007 -0500, Michael Richardson wrote:
>>  Also look at "ipsec spi" output.
>
>Adding this command to the 60 second loop causes
>the OpenSwan and the transfer to hose up in a matter
>of an hour or two, so I have removed it.  Will run
>the command after the lock-up that results from
>7GB of data transfer.
>
>Tried stop/starting OpenSwan, but it did not restore
>everything, so had to reboot the router.  Seems some
>kernel corruption happens in this case.

Not so sure about this now.  The restarted transfer blew up within
an hour of restarting at 3:00am.  Now I'm wondering if the
Cisco VPN firewall was hosed by the transfers.  Looks like the
remote side was reset at 9:00am, eliminating the persistent
packet loss that had appeared coincident with the short-interval
failures.

Is this something anyone has encountered before?


 
_______________________________________________
Dev mailing list
Dev <at> openswan.org
http://lists.openswan.org/mailman/listinfo/dev
