Charles M. Hannum | 1 Jan 2005 20:22

Re: TCP/Westwood+ support.

On Saturday 01 January 2005 14:49, Kentaro A. Kurahone wrote:
> I've implemented TCP/Westwood+ congestion control for NetBSD.  The authors
> claim that it deals better with high BDP lossy networks (like wireless).
> Since I'm using it to ssh to another box to write this e-mail, I'm fairly
> confident that I didn't break anything.
>
> Description: http://www-ictserv.poliba.it/mascolo/tcp%20westwood/homeW.htm
> Patch:
> http://www.sigusr1.org/~kurahone/tcp-westwood+-netbsd-2.99.11.diff.gz
>
> Feedback will be appreciated.

If I understand your code correctly, it makes two changes to the algorithm:

a) A running bandwidth estimate based on the ack rate is kept, and ssthresh is 
set according to that -- in other words, we do exponential growth up to the 
estimated bandwidth, and then linear growth thereafter, whereas Reno will 
just keep trying to increase ssthresh forever.

b) On a fast retransmit, the congestion window is initialized to ssthresh -- 
in other words, we always set to the linear growth point on a fast 
retransmit, and never do exponential growth again, except in the case of slow 
retransmit (or CWM).

Both of these behaviors are, on their face, more conservative than Reno.
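
A minimal sketch of behaviors (a) and (b) as described above.  This is not
the code from the patch: the struct, the function names and the units
(bytes, bytes/sec, milliseconds) are all assumptions made for illustration,
and the two-segment floor is likewise assumed.

```c
#include <stdint.h>

/* Hypothetical per-connection state; not the struct used in the patch. */
struct ww_state {
	uint32_t bwe;        /* bandwidth estimate from the ack rate, bytes/sec */
	uint32_t rtt_min_ms; /* smallest RTT observed, milliseconds */
	uint32_t maxseg;     /* segment size, bytes */
	uint32_t cwnd;       /* congestion window, bytes */
	uint32_t ssthresh;   /* slow start threshold, bytes */
};

/* (a) ssthresh follows the estimated bandwidth-delay product. */
static uint32_t
ww_ssthresh(const struct ww_state *s)
{
	uint64_t bdp = (uint64_t)s->bwe * s->rtt_min_ms / 1000;
	uint64_t lo = 2ULL * s->maxseg;	/* assumed floor of two segments */

	if (bdp < lo)
		bdp = lo;
	return bdp > UINT32_MAX ? UINT32_MAX : (uint32_t)bdp;
}

/* (b) Fast retransmit: restart at the linear growth point. */
static void
ww_fast_retransmit(struct ww_state *s)
{
	s->ssthresh = ww_ssthresh(s);
	s->cwnd = s->ssthresh;
}

/* Slow retransmit (timeout): one segment, exponential growth again. */
static void
ww_timeout(struct ww_state *s)
{
	s->ssthresh = ww_ssthresh(s);
	s->cwnd = s->maxseg;
}
```

With an estimate of 1000000 bytes/sec and a 100 ms minimum RTT, both paths
set ssthresh to 100000 bytes; only the timeout path drops cwnd back to a
single segment.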

My question is: what happens if you have rapid fluctuations; e.g. due to 
sharing a link with another system that is doing occasional short 
transactions that are not really congestion-controlled?  It appears to me 
that Westwood is fairly slow to react in decreasing ssthresh and switching to 
the more conservative linear growth.  This could be problematic in some 
circumstances.

[snip]

Charles M. Hannum | 1 Jan 2005 20:36

Re: TCP/Westwood+ support.

On Saturday 01 January 2005 19:22, Charles M. Hannum wrote:
> a) A running bandwidth estimate based on the ack rate is kept, and ssthresh
> is set according to that -- in other words, we do exponential growth up to
> the estimated bandwidth, and then linear growth thereafter, whereas Reno
> will just keep trying to increase ssthresh forever.

Actually, that's not quite right.  In Reno, ssthresh is set to half the 
current window (meaning the lower of the congestion window or how much data 
is outstanding at the time of the retransmit) -- so the Westwood behavior is 
potentially more aggressive.  This raises the question of whether it's too 
slow to adapt and might be *too* aggressive, hurting other traffic.  I don't 
see where this has been tested.
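
To make the comparison concrete, here is a rough sketch of the two ssthresh
rules side by side.  The function names, the units and the two-segment floor
on the Westwood side are assumptions; neither function is taken from the
patch or from the NetBSD tree.

```c
#include <stdint.h>

/* Reno: half the current window in whole segments, never below two. */
static uint32_t
reno_ssthresh(uint32_t cwnd, uint32_t outstanding, uint32_t maxseg)
{
	uint32_t win = (cwnd < outstanding ? cwnd : outstanding) / 2 / maxseg;

	if (win < 2)
		win = 2;
	return win * maxseg;
}

/* Westwood+: estimated bandwidth (bytes/sec) times minimum RTT (ms). */
static uint32_t
westwood_ssthresh(uint32_t bwe, uint32_t rtt_min_ms, uint32_t maxseg)
{
	uint64_t bdp = (uint64_t)bwe * rtt_min_ms / 1000;
	uint64_t lo = 2ULL * maxseg;	/* assumed floor of two segments */

	if (bdp < lo)
		bdp = lo;
	return bdp > UINT32_MAX ? UINT32_MAX : (uint32_t)bdp;
}
```

With a 64000-byte window and 1000-byte segments, Reno falls back to 32000
bytes.  If the estimate is still 500000 bytes/sec at a 100 ms minimum RTT,
Westwood+ falls back to 50000 bytes, so a stale estimate leaves it sending
noticeably more than Reno would.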

Charles M. Hannum | 1 Jan 2005 20:52

Re: TCP/Westwood+ support.

On Saturday 01 January 2005 19:22, Charles M. Hannum wrote:
> b) On a fast retransmit, the congestion window is initialized to ssthresh
> -- in other words, we always set to the linear growth point on a fast
> retransmit, and never do exponential growth again, except in the case of
> slow retransmit (or CWM).

Actually, I'm really confused by this code.  AFAICT, you're never actually 
doing the fast retransmit; the code that does it is after your "if 
(tcp_do_westwood_p) {... goto drop;}".  This would, I think, force you to be 
doing slow-retransmit all the time.
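
The shape of the problem, reduced to a toy example.  Only the early
"goto drop" mirrors the patch; the struct, its fields and both function
names are hypothetical stand-ins, and the second function is just one
possible way to restructure the branch, not necessarily the right fix.

```c
#include <stdbool.h>

/* Hypothetical connection state, for illustration only. */
struct conn {
	int  dupacks;
	bool window_adjusted;	/* stands in for the ssthresh/cwnd update */
	bool fast_rexmt_done;	/* stands in for the actual retransmit */
	bool westwood;		/* stands in for tcp_do_westwood_p */
};

/* Early exit: the retransmit below the branch is never reached. */
static void
dupack_early_exit(struct conn *c)
{
	if (++c->dupacks == 3) {
		if (c->westwood) {
			c->window_adjusted = true;
			goto drop;	/* leaves before the retransmit */
		}
		c->fast_rexmt_done = true;
	}
drop:
	return;
}

/* Fall through instead, so the retransmit still happens. */
static void
dupack_fall_through(struct conn *c)
{
	if (++c->dupacks == 3) {
		if (c->westwood)
			c->window_adjusted = true;
		c->fast_rexmt_done = true;
	}
}
```

After three duplicate acks the first version has adjusted the window but
retransmitted nothing, so the loss is only repaired when the retransmit
timer fires.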

Kentaro A. Kurahone | 1 Jan 2005 21:04

Re: TCP/Westwood+ support.

On Sat, Jan 01, 2005 at 07:52:21PM +0000, Charles M. Hannum wrote:
> On Saturday 01 January 2005 19:22, Charles M. Hannum wrote:
> > b) On a fast retransmit, the congestion window is initialized to ssthresh
> > -- in other words, we always set to the linear growth point on a fast
> > retransmit, and never do exponential growth again, except in the case of
> > slow retransmit (or CWM).
> 
> Actually, I'm really confused by this code.  AFAICT, you're never actually 
> doing the fast retransmit; the code that does it is after your "if 
> (tcp_do_westwood_p) {... goto drop;}".  This would, I think, force you to be 
> doing slow-retransmit all the time.

Crud.  You're right.  Need to spend more time with Stevens' vol 2 I guess.

Thanks for all the feedback,

-- 
Kentaro A. Kurahone
SIGUSR1 Research and Development
"I am having a hallucination now, I don't need drugs for that." 

Kentaro A. Kurahone | 1 Jan 2005 20:48

Re: TCP/Westwood+ support.

On Sat, Jan 01, 2005 at 07:22:36PM +0000, Charles M. Hannum wrote:
[snip]
> Both of these behaviors are, on their face, more conservative than Reno.
> 
> My question is: what happens if you have rapid fluctuations; e.g. due to 
> sharing a link with another system that is doing occasional short 
> transactions that are not really congestion-controlled?  It appears to me 
> that Westwood is fairly slow to react in decreasing ssthresh and switching to 
> the more conservative linear growth.  This could be problematic in some 
> circumstances.

Fair enough.  Another thing that I noticed when I was implementing this is
that if the connection gets rerouted (longer rtt) the bandwidth estimation code
will start to become inaccurate because the minRTT value that's being kept
will no longer be relevant.

> I also see a few problems with the implementation:
> 
> 1) In the fast-retransmit case, you are blindly setting cwnd to ssthresh; if 
> cwnd is already less than ssthresh, you should not do this.

That's what the paper describing the algorithm[0] said to do.
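
For reference, the difference between the two readings of point 1 is a
one-line guard.  Both helper names below are made up for this sketch, and
the windows are assumed to be in bytes.

```c
#include <stdint.h>

/* Unconditional: cwnd always becomes ssthresh, even if that raises it. */
static uint32_t
cwnd_unconditional(uint32_t cwnd, uint32_t ssthresh)
{
	(void)cwnd;
	return ssthresh;
}

/* Guarded: cwnd is only ever lowered to ssthresh, never raised. */
static uint32_t
cwnd_guarded(uint32_t cwnd, uint32_t ssthresh)
{
	return cwnd > ssthresh ? ssthresh : cwnd;
}
```

With cwnd at 3000 and ssthresh at 8000, the unconditional form opens the
window to 8000 in response to a loss, while the guarded form leaves it at
3000.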

> 2) Your simple arithmetic in tcp_westwood_p_bwe() is potentially susceptible 
> to roundoff issues on low-bandwidth links, similar to the ones Brakmo and 
> Peterson complained about in the TCP Vegas paper (and that we fixed years 
> ago).  It's probably not as bad since you're dealing with byte counts rather 
> than packet counts, though.
> 
> 3) In tcp_westwood_p(), you are always setting dupacks to 0.  This is a bit 
[snip]
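
To illustrate the kind of roundoff raised in point 2: the sketch below is
not the arithmetic from tcp_westwood_p_bwe().  It assumes a plain 7/8 + 1/8
integer low-pass filter over bytes-per-tick samples, and every name in it is
hypothetical.

```c
#include <stdint.h>

#define BWE_SHIFT 8	/* hypothetical scale: 8 fractional bits */

/* Naive: bytes per tick; the division truncates small samples to 0. */
static uint32_t
bwe_naive(uint32_t bwe, uint32_t acked, uint32_t ticks)
{
	uint32_t sample;

	if (ticks == 0)
		return bwe;	/* no time elapsed, keep the old estimate */
	sample = acked / ticks;
	return (uint32_t)((7ULL * bwe + sample) / 8);
}

/* Scaled: same filter, but the estimate is kept in 1/256 byte per tick. */
static uint32_t
bwe_scaled(uint32_t bwe_fp, uint32_t acked, uint32_t ticks)
{
	uint64_t sample;

	if (ticks == 0)
		return bwe_fp;
	sample = ((uint64_t)acked << BWE_SHIFT) / ticks;
	return (uint32_t)((7ULL * bwe_fp + sample) / 8);
}
```

On a slow link that acks 40 bytes every 50 ticks, the naive filter never
leaves zero, while the scaled one moves toward 0.8 bytes per tick (roughly
204 in fixed point) and settles a little below it because the filter itself
still truncates.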

Kentaro A. Kurahone | 1 Jan 2005 20:55

Re: TCP/Westwood+ support.

On Sat, Jan 01, 2005 at 07:36:42PM +0000, Charles M. Hannum wrote:
> On Saturday 01 January 2005 19:22, Charles M. Hannum wrote:
> > a) A running bandwidth estimate based on the ack rate is kept, and ssthresh
> > is set according to that -- in other words, we do exponential growth up to
> > the estimated bandwidth, and then linear growth thereafter, whereas Reno
> > will just keep trying to increase ssthresh forever.
> 
> Actually, that's not quite right.  In Reno, ssthresh is set to half the 
> current window (meaning the lower of the congestion window or how much data 
> is outstanding at the time of the retransmit) -- so the Westwood behavior is 
> potentially more aggressive.  This raises the question of whether it's too 
> slow to adapt and might be *too* aggressive, hurting other traffic.  I don't 
> see where this has been tested.

Theoretically the bandwidth estimation code is "accurate" enough that it won't
hurt other traffic.  The numbers that are in the research paper look good,
but they probably have no relation to what will happen in the real world.

Well, this was more of an exercise in relearning the BSD tcp/ip stack code.
Looked at this algorithm briefly when doing dayJob, and figured that it would be
sufficiently interesting to implement, and potentially useful.  That being said,
I'm also poking at implementing the HighSpeed TCP RFCs, which would probably be
a lot more useful to people.

-- 
Kentaro A. Kurahone
SIGUSR1 Research and Development
"I am having a hallucination now, I don't need drugs for that." 


Hans Rosenfeld | 2 Jan 2005 15:42

EtherIP RFC 3378

Hello,

Two weeks ago I ported parts of the RFC 3378 EtherIP support from
OpenBSD to NetBSD.

I've been using it for some weeks now without any problems, so if anyone
else wants to give it a try there is a diff against 2.0 in
http://www.headcrashers.org/comp/programs/netbsd-2-0-etherip.diff

This diff makes it possible to add gif interfaces to a bridge, which
will then send and receive IP protocol 97 packets. Outbound packets are
ethernet frames with an EtherIP header prepended; inbound packets are
assumed to be the same, but the header is not checked for validity.
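
For anyone who wants to add the missing check, here is a rough sketch of
what it could look like.  It is not part of the diff.  RFC 3378 defines a
16-bit header with a 4-bit version of 3 followed by 12 reserved bits that
are sent as zero; rejecting non-zero reserved bits on input is a strictness
choice assumed here, and the macro and function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

#define ETHERIP_VERSION	3	/* version field value from RFC 3378 */
#define ETHERIP_HDRLEN	2	/* header is 16 bits */

/* Fill in the header that is prepended to an outbound ethernet frame. */
static void
etherip_hdr_write(uint8_t hdr[ETHERIP_HDRLEN])
{
	hdr[0] = ETHERIP_VERSION << 4;	/* version in the top 4 bits */
	hdr[1] = 0;			/* reserved bits, zero */
}

/* Return 1 if an inbound packet starts with an acceptable header. */
static int
etherip_hdr_valid(const uint8_t *pkt, size_t len)
{
	if (len < ETHERIP_HDRLEN)
		return 0;
	if ((pkt[0] >> 4) != ETHERIP_VERSION)
		return 0;
	if ((pkt[0] & 0x0f) != 0 || pkt[1] != 0)
		return 0;	/* assumed: reject non-zero reserved bits */
	return 1;
}
```

A receiver that needs to interoperate with looser peers might want to skip
the reserved-bits test and check only the version.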

Also I left out stuff like the net.inet.etherip.allow sysctl that
OpenBSD has; I just wanted to make this work as simply as possible.

I hope this stuff is of some use; I've seen at least one request for this
in the archive of some other NetBSD list a while ago.

Hans

-- 
%SYSTEM-F-ANARCHISM, The operating system has been overthrown

john heasley | 3 Jan 2005 01:14

clear h/w csum flags in tcp_respond()

Unless I'm missing something, tcp_respond() always performs the TCP
checksum.  Thus, the mbuf pkthdr csum_flags ought to be cleared, since
csum_data is not set for use with h/w checksumming by tcp_respond() and
the checksum does not need to be done twice.

I noticed it when my test box came back after crashing.  The connections
that I had open before the crash were not reset.  Note, this is an hme(4)
with my changes to ip_output() to stuff the pseudo-hdr checksum for hme's
brand of h/w checksum.

does this look ok/reasonable?

Index: tcp_subr.c
===================================================================
RCS file: /cvsroot/src/sys/netinet/tcp_subr.c,v
retrieving revision 1.176
diff -u -r1.176 tcp_subr.c
--- tcp_subr.c	19 Dec 2004 06:42:24 -0000	1.176
+++ tcp_subr.c	3 Jan 2005 00:02:02 -0000
@@ -689,6 +689,9 @@
 			m_freem(m);
 			return EAFNOSUPPORT;
 		}
+		/* clear h/w csum data from rx packet */
+		m->m_pkthdr.csum_flags = 0;
+
 		if ((flags & TH_SYN) == 0 || sizeof(*th0) > (th0->th_off << 2))
 			tlen = sizeof(*th0);
 		else


Kentaro A. Kurahone | 3 Jan 2005 07:04

Re: TCP/Westwood+ support.

Hi,

I went back and reworked the code, mainly to fix up the broken fast
retransmit.  I also took a stab at carving off the congestion control related
code into a set of pluggable functions accessed via a struct of function pointers.

It should make tweaking the congestion control code a tiny bit easier.
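
As a rough idea of the shape (a simplified sketch, not the interface in the
patch): each algorithm supplies a table of function pointers and the common
code calls through it.  All of the names below are hypothetical, there is
only one hook, and Reno's window inflation during fast recovery is left out.

```c
#include <stdint.h>

/* Hypothetical connection state shared by all algorithms. */
struct cc_conn {
	uint32_t cwnd, ssthresh, maxseg;	/* bytes */
	uint32_t bwe;				/* bytes/sec */
	uint32_t rtt_min_ms;			/* milliseconds */
};

/* One table per algorithm. */
struct cc_ops {
	const char *name;
	uint32_t (*new_ssthresh)(const struct cc_conn *);
};

static uint32_t
reno_new_ssthresh(const struct cc_conn *c)
{
	uint32_t half = c->cwnd / 2;

	return half < 2 * c->maxseg ? 2 * c->maxseg : half;
}

static uint32_t
westwood_new_ssthresh(const struct cc_conn *c)
{
	uint64_t bdp = (uint64_t)c->bwe * c->rtt_min_ms / 1000;
	uint64_t lo = 2ULL * c->maxseg;

	if (bdp < lo)
		bdp = lo;
	return bdp > UINT32_MAX ? UINT32_MAX : (uint32_t)bdp;
}

static const struct cc_ops cc_reno = { "reno", reno_new_ssthresh };
static const struct cc_ops cc_westwood = { "westwood+", westwood_new_ssthresh };

/* Common code: no per-algorithm tests, just a call through the table. */
static void
cc_fast_retransmit(const struct cc_ops *ops, struct cc_conn *c)
{
	c->ssthresh = ops->new_ssthresh(c);
	c->cwnd = c->ssthresh;
}
```

Switching algorithms then means pointing the connection at a different
table, e.g. cc_fast_retransmit(&cc_westwood, &conn) instead of
cc_fast_retransmit(&cc_reno, &conn).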

The math in the bandwidth estimation algorithm is still a bit iffy,
but this TCP variant is aimed at high BDP links, so it's probably ok.

Patch at:
http://www.sigusr1.org/~kurahone/tcp-westwood+-netbsd-2.99.11-r2.diff.gz

-- 
Kentaro A. Kurahone
SIGUSR1 Research and Development
"I am having a hallucination now, I don't need drugs for that." 

Jason Thorpe | 3 Jan 2005 08:48

Re: clear h/w csum flags in tcp_respond()


On Jan 2, 2005, at 4:14 PM, john heasley wrote:

> does this look ok/reasonable?

Yes, please check it in.

         -- Jason R. Thorpe <thorpej <at> shagadelic.org>

