Nivedita Singhvi | 1 May 02:09 2004

[PATCH 2.6.5] Re: Fw: Stack sends SYN/ACKs even though accept queue is full

Andrew Morton wrote:

>Begin forwarded message:
>
>Date: Thu, 29 Apr 2004 14:53:36 -0700
>From: Jan Olderdissen <jan@ixiacom.com>
>To: "'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>
>Subject: Stack sends SYN/ACKs even though accept queue is full
>
Attaching a patch which adds a sysctl to turn off this
behaviour.   Could you test this, please?  Patch against
2.6.5 vanilla kernel.  If you need a 2.4 version, let me
know.

>Because newly accepted connections are considered 'young', two such
>connections put on the synq will cause additional SYNs to be dropped until
>young connections age and additional connections are SYN/ACKed, etc. Since
>the initial TCP timeout is three seconds, you would expect two additional
>connections to be accepted every three seconds. However, experiments with
>2.4.25 show that number to be two connections every four seconds for unclear
>reasons.
>
Normally, I think the expectation was that connections would be
short-lived, which is reasonable for most web servers etc. In that
case the accept queue frees up frequently, and having the SYN request
already queued saves a full timeout and round trip over the Internet,
i.e. it is useful in the common case.
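
The rule being discussed can be modelled in a few lines of userspace C.
This is only a sketch under stated assumptions: the struct, its field
names and the preload_synq switch (standing in for the proposed sysctl)
are hypothetical, and the "full" test is simplified. It is not the
kernel code, just the decision it is meant to illustrate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a listening socket; names are illustrative only. */
struct listen_model {
	unsigned int acceptq_len;	/* established connections waiting for accept() */
	unsigned int acceptq_max;	/* backlog limit */
	unsigned int synq_young;	/* SYN queue entries not yet retransmitted */
	bool preload_synq;		/* assumed stand-in for the proposed sysctl */
};

/* Returns true if an incoming SYN would be answered with a SYN/ACK. */
static bool syn_admitted(const struct listen_model *l)
{
	assert(l != NULL);

	/* Room in the accept queue: always answer. */
	if (l->acceptq_len < l->acceptq_max)
		return true;

	/* Accept queue full and preloading disabled: drop the SYN. */
	if (!l->preload_synq)
		return false;

	/* Accept queue full: tolerate at most two young entries in total. */
	return l->synq_young <= 1;
}
```

With preloading on and the accept queue full, a third SYN is dropped
as soon as two young entries are queued, which matches the "two
connections, then drops" pattern described in the report.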

I don't think it is worthwhile for environments where connections

Francois Romieu | 1 May 02:24 2004

[patch 1/7] 2.6.6-rc3-mm1 - r8169 napi


NAPI for r8169 (Jon D Mason <jonmason@us.ibm.com>).
Both Tx and Rx processing are moved to the ->poll() function.
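
For readers unfamiliar with the idea behind the rtl8169_rx_quota() macro
in the diff, here is a rough userspace model of quota-limited polling.
The ring_model type and rx_poll_model() are hypothetical names used for
illustration only; the real driver walks its descriptor ring and hands
skbs to the stack.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MIN(a, b)	((a) < (b) ? (a) : (b))

/* Hypothetical stand-in for the Rx ring: only counts pending frames. */
struct ring_model {
	unsigned int pending;
};

/* Process at most 'quota' frames and return how many were handled. */
static unsigned int rx_poll_model(struct ring_model *r, unsigned int quota)
{
	unsigned int todo, done;

	assert(r != NULL);

	/* Same idea as min(count, quota) in the NAPI variant of the macro. */
	todo = MODEL_MIN(r->pending, quota);

	for (done = 0; done < todo; done++)
		r->pending--;	/* stand-in for passing one frame up the stack */

	return done;
}
```

A caller would keep polling while frames remain pending and only
re-enable the interrupt once a pass leaves the ring empty.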

diff -puN drivers/net/r8169.c~r8169-napi drivers/net/r8169.c
--- linux-2.6.6-rc3/drivers/net/r8169.c~r8169-napi	2004-05-01 01:46:29.000000000 +0200
+++ linux-2.6.6-rc3-fr/drivers/net/r8169.c	2004-05-01 01:46:29.000000000 +0200
@@ -64,6 +64,14 @@ VERSION 1.2	<2002/11/30>
 #define dprintk(fmt, args...)	do {} while (0)
 #endif /* RTL8169_DEBUG */

+#ifdef CONFIG_R8169_NAPI
+#define rtl8169_rx_skb			netif_receive_skb
+#define rtl8169_rx_quota(count, quota)	min(count, quota)
+#else
+#define rtl8169_rx_skb			netif_rx
+#define rtl8169_rx_quota(count, quota)	count
+#endif
+
 /* media options */
 #define MAX_UNITS 8
 static int media[MAX_UNITS] = { -1, -1, -1, -1, -1, -1, -1, -1 };
@@ -90,8 +98,9 @@ static int multicast_filter_limit = 32;
 #define RxPacketMaxSize	0x0800	/* Maximum size supported is 16K-1 */
 #define InterFrameGap	0x03	/* 3 means InterFrameGap = the shortest one */

+#define R8169_NAPI_WEIGHT	64
 #define NUM_TX_DESC	64	/* Number of Tx descriptor registers */
-#define NUM_RX_DESC	64	/* Number of Rx descriptor registers */
+#define NUM_RX_DESC	256	/* Number of Rx descriptor registers */

Francois Romieu | 1 May 02:23 2004

[patch 0/7] 2.6.6-rc3-mm1 - description of the r8169 queue

With some delay, the following patches include Jon D Mason's NAPI changes
(+ fixes) and code from Andy Lutomirski with minor changes.
I have not moved the initialization of the phy timer in the pci probe
routine as:
- I believe it belongs to the netdevice;
- it should work as is without significant change for the user.

If someone sees a good reason to move it, just complain (with an axe).

The patches apply to 2.6.6-rc3 as well as to 2.6.6-rc3-mm1.

All the patches are merged in a single patch against 2.6.6-rc3 available at:
http://www.fr.zoreil.com/people/francois/misc/20040501-2.6.6-rc3-r8169.c-test.patch

The patches are archived below as well:
http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.6-rc3

If the patches prove to behave decently on 2.6, a backport for 2.4.x will be
generated.

--
Ueimor

Francois Romieu | 1 May 02:26 2004

[patch 3/7] 2.6.6-rc3-mm1 - r8169 register rename


RxUnderrun status bit renamed to LinkChg (identical to the 8139cp driver).

diff -puN drivers/net/r8169.c~r8169-register-rename drivers/net/r8169.c
--- linux-2.6.6-rc3/drivers/net/r8169.c~r8169-register-rename	2004-05-01 01:46:33.000000000 +0200
+++ linux-2.6.6-rc3-fr/drivers/net/r8169.c	2004-05-01 01:46:33.000000000 +0200
@@ -203,7 +203,7 @@ enum RTL8169_register_content {
 	SWInt = 0x0100,
 	TxDescUnavail = 0x80,
 	RxFIFOOver = 0x40,
-	RxUnderrun = 0x20,
+	LinkChg = 0x20,
 	RxOverflow = 0x10,
 	TxErr = 0x08,
 	TxOK = 0x04,
@@ -359,9 +359,9 @@ static int rtl8169_poll(struct net_devic
 #endif

 static const u16 rtl8169_intr_mask =
-	RxUnderrun | RxOverflow | RxFIFOOver | TxErr | TxOK | RxErr | RxOK;
+	LinkChg | RxOverflow | RxFIFOOver | TxErr | TxOK | RxErr | RxOK;
 static const u16 rtl8169_napi_event =
-	RxOK | RxUnderrun | RxOverflow | RxFIFOOver | TxOK | TxErr;
+	RxOK | LinkChg | RxOverflow | RxFIFOOver | TxOK | TxErr;
 static const unsigned int rtl8169_rx_config =
     (RX_FIFO_THRESH << RxCfgFIFOShift) | (RX_DMA_BURST << RxCfgDMAShift);

@@ -1569,7 +1569,7 @@ rtl8169_interrupt(int irq, void *dev_ins

 		handled = 1;

Francois Romieu | 1 May 02:25 2004

[patch 2/7] 2.6.6-rc3-mm1 - r8169 janitoring


Spring cleanup
- missing ULL qualifier;
- unsigned int (u32) should be slightly faster on ppc64 (Jon D Mason);
- misc minor de-uglyfication.

diff -puN drivers/net/r8169.c~r8169-janitorial drivers/net/r8169.c
--- linux-2.6.6-rc3/drivers/net/r8169.c~r8169-janitorial	2004-05-01 01:46:31.000000000 +0200
+++ linux-2.6.6-rc3-fr/drivers/net/r8169.c	2004-05-01 01:46:31.000000000 +0200
@@ -315,10 +315,10 @@ struct RxDesc {
 };

 struct rtl8169_private {
-	void *mmio_addr;	/* memory map physical address */
+	void *mmio_addr;		/* memory map physical address */
 	struct pci_dev *pci_dev;	/* Index of PCI device  */
 	struct net_device_stats stats;	/* statistics of net device */
-	spinlock_t lock;	/* spin lock flag */
+	spinlock_t lock;		/* spin lock flag */
 	int chipset;
 	int mac_version;
 	int phy_version;
@@ -326,12 +326,12 @@ struct rtl8169_private {
 	u32 cur_tx; /* Index into the Tx descriptor buffer of next Rx pkt. */
 	u32 dirty_rx;
 	u32 dirty_tx;
-	struct TxDesc *TxDescArray;	/* Index of 256-alignment Tx Descriptor buffer */
-	struct RxDesc *RxDescArray;	/* Index of 256-alignment Rx Descriptor buffer */
+	struct TxDesc *TxDescArray;	/* 256-aligned Tx descriptor ring */
+	struct RxDesc *RxDescArray;	/* 256-aligned Rx descriptor ring */

Francois Romieu | 1 May 02:28 2004

[patch 4/7] 2.6.6-rc3-mm1 - r8169 ethtool .set_settings


ethtool set_settings support (Andy Lutomirski <luto@myrealbox.com>).

diff -puN drivers/net/r8169.c~r8169-ethtool-set_settings drivers/net/r8169.c
--- linux-2.6.6-rc3/drivers/net/r8169.c~r8169-ethtool-set_settings	2004-05-01 01:46:36.000000000 +0200
+++ linux-2.6.6-rc3-fr/drivers/net/r8169.c	2004-05-01 01:46:36.000000000 +0200
@@ -413,8 +413,68 @@ static void rtl8169_get_drvinfo(struct n
 	strcpy(info->bus_info, pci_name(tp->pci_dev));
 }

+static void rtl8169_set_speed(struct net_device *dev,
+			      u8 autoneg, u16 speed, u8 duplex)
+{
+	struct rtl8169_private *tp = netdev_priv(dev);
+	void *ioaddr = tp->mmio_addr;
+	int auto_nego, giga_ctrl;
+	u8 status;
+
+	status = RTL_R8(PHYstatus);
+	if ((status & TBI_Enable) && (autoneg == AUTONEG_DISABLE)) {
+		autoneg = AUTONEG_ENABLE;
+		printk(KERN_WARNING PFX
+		       "%s: ignoring request to force speed in TBI mode\n",
+		       dev->name);
+	}
+
+	auto_nego = mdio_read(ioaddr, PHY_AUTO_NEGO_REG);
+	auto_nego &= ~(PHY_Cap_10_Half | PHY_Cap_10_Full |
+		       PHY_Cap_100_Half | PHY_Cap_100_Full);

Francois Romieu | 1 May 02:29 2004

[patch 5/7] 2.6.6-rc3-mm1 - r8169 ethtool .get_settings


ethtool get_settings() for r8169 (Andy Lutomirski <luto@myrealbox.com>).

The locking does not need to be specially clever.

diff -puN drivers/net/r8169.c~r8169-ethtool-get_settings drivers/net/r8169.c
--- linux-2.6.6-rc3/drivers/net/r8169.c~r8169-ethtool-get_settings	2004-05-01 01:46:38.000000000 +0200
+++ linux-2.6.6-rc3-fr/drivers/net/r8169.c	2004-05-01 01:46:38.000000000 +0200
@@ -336,6 +336,8 @@ struct rtl8169_private {
 	unsigned long phy_link_down_cnt;
 	u16 cp_cmd;
 	u16 intr_mask;
+	int phy_auto_nego_reg;
+	int phy_1000_ctrl_reg;
 };

 MODULE_AUTHOR("Realtek");
@@ -451,6 +453,9 @@ static void rtl8169_set_speed(struct net
 			auto_nego &= ~(PHY_Cap_10_Full | PHY_Cap_100_Full);
 	}

+	tp->phy_auto_nego_reg = auto_nego;
+	tp->phy_1000_ctrl_reg = giga_ctrl;
+
 	if (!(status & TBI_Enable)) {
 		mdio_write(ioaddr, PHY_AUTO_NEGO_REG, auto_nego);
 		mdio_write(ioaddr, PHY_1000_CTRL_REG, giga_ctrl);
@@ -460,6 +465,56 @@ static void rtl8169_set_speed(struct net
 		   PHY_Enable_Auto_Nego | PHY_Restart_Auto_Nego);

Francois Romieu | 1 May 02:30 2004

[patch 6/7] 2.6.6-rc3-mm1 - r8169 link handling rework (1/2)


Link handling changes (Andy Lutomirski <luto@myrealbox.com>):
- the LinkChg irq enables the phy timer when the link goes down;
- the phy timer is enabled in rtl8169_set_speed() to protect against
  link negotiation failure;
- removed rtl8169_hw_phy_reset() and its busy loop;
- added spinlocking in timer context for rtl8169_phy_timer to avoid
  messing with the {set/get}_settings commands issued via ethtool.

diff -puN drivers/net/r8169.c~r8169-link-00 drivers/net/r8169.c
--- linux-2.6.6-rc3/drivers/net/r8169.c~r8169-link-00	2004-05-01 01:46:40.000000000 +0200
+++ linux-2.6.6-rc3-fr/drivers/net/r8169.c	2004-05-01 01:46:40.000000000 +0200
@@ -41,6 +41,7 @@ VERSION 1.2	<2002/11/30>
 #include <linux/etherdevice.h>
 #include <linux/delay.h>
 #include <linux/ethtool.h>
+#include <linux/mii.h>
 #include <linux/crc32.h>
 #include <linux/init.h>
 #include <linux/dma-mapping.h>
@@ -333,7 +334,8 @@ struct rtl8169_private {
 	struct sk_buff *Rx_skbuff[NUM_RX_DESC];	/* Rx data buffers */
 	struct sk_buff *Tx_skbuff[NUM_TX_DESC];	/* Tx data buffers */
 	struct timer_list timer;
-	unsigned long phy_link_down_cnt;
+	unsigned int phy_tried_renegotiate;
+	unsigned int phy_reset_warned;
 	u16 cp_cmd;
 	u16 intr_mask;
 	int phy_auto_nego_reg;

Francois Romieu | 1 May 02:31 2004

[patch 7/7] 2.6.6-rc3-mm1 - r8169 link handling rework (2/2)

Use rtl8169_set_speed() for link setup in rtl8169_init_one():
- the code which handles the option checking is isolated;
- display (once) a notice message about the deprecated interface;
- rtl8169_open() enables the phy timer if the link is not up;
- rtl8169_set_speed() checks that the netdevice is actually ready
  in order to activate the timer.
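
As a rough illustration of the isolated option checking, the lookup can
be thought of as a small table mapping the deprecated 'media' value to
(speed, duplex, autoneg). This is a userspace sketch, not the patch
code: the MODEL_* constants and link_option_model() are hypothetical,
the function takes the media value directly instead of an index, and
only the 0x04 = 100Mbps half-duplex entry comes from the driver's header
comment. The other media values are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the kernel's duplex/autoneg constants. */
enum { MODEL_DUPLEX_HALF = 0, MODEL_DUPLEX_FULL = 1 };
enum { MODEL_AUTONEG_DISABLE = 0, MODEL_AUTONEG_ENABLE = 1 };

struct link_setting {
	unsigned short speed;	/* Mb/s */
	unsigned char duplex;
	unsigned char autoneg;
	unsigned char media;	/* value of the deprecated 'media' option */
};

static const struct link_setting link_settings[] = {
	{   10, MODEL_DUPLEX_HALF, MODEL_AUTONEG_DISABLE, 0x01 }, /* assumed */
	{   10, MODEL_DUPLEX_FULL, MODEL_AUTONEG_DISABLE, 0x02 }, /* assumed */
	{  100, MODEL_DUPLEX_HALF, MODEL_AUTONEG_DISABLE, 0x04 }, /* per header comment */
	{  100, MODEL_DUPLEX_FULL, MODEL_AUTONEG_DISABLE, 0x08 }, /* assumed */
	{ 1000, MODEL_DUPLEX_FULL, MODEL_AUTONEG_ENABLE,  0x10 }, /* assumed */
};

/* Translate a media option into link parameters; unknown values autonegotiate. */
static void link_option_model(int media, unsigned char *autoneg,
			      unsigned short *speed, unsigned char *duplex)
{
	size_t i;

	assert(autoneg != NULL && speed != NULL && duplex != NULL);

	/* Default when the option is unset or unrecognised. */
	*autoneg = MODEL_AUTONEG_ENABLE;
	*speed = 1000;
	*duplex = MODEL_DUPLEX_FULL;

	for (i = 0; i < sizeof(link_settings) / sizeof(link_settings[0]); i++) {
		if (link_settings[i].media == media) {
			*autoneg = link_settings[i].autoneg;
			*speed = link_settings[i].speed;
			*duplex = link_settings[i].duplex;
			return;
		}
	}
}
```

Keeping the mapping in one table means the result can be handed
straight to a single setup routine such as rtl8169_set_speed(), which
is the point of the rework described above.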

diff -puN drivers/net/r8169.c~r8169-link-10 drivers/net/r8169.c
--- linux-2.6.6-rc3/drivers/net/r8169.c~r8169-link-10	2004-05-01 01:46:42.000000000 +0200
+++ linux-2.6.6-rc3-fr/drivers/net/r8169.c	2004-05-01 01:46:42.000000000 +0200
@@ -7,7 +7,7 @@
  Feb  4 2002	- created initially by ShuChen <shuchen@realtek.com.tw>.
  May 20 2002	- Add link status force-mode and TBI mode support.
 =========================================================================
-  1. The media can be forced in 5 modes.
+  1. [DEPRECATED: use ethtool instead] The media can be forced in 5 modes.
 	 Command: 'insmod r8169 media = SET_MEDIA'
 	 Ex:	  'insmod r8169 media = 0x04' will force PHY to operate in 100Mpbs Half-duplex.
 	
@@ -432,6 +432,37 @@ static void rtl8169_check_link_status(st
 	spin_unlock_irqrestore(&tp->lock, flags);
 }

+static void rtl8169_link_option(int idx, u8 *autoneg, u16 *speed, u8 *duplex)
+{
+	struct {
+		u16 speed;
+		u8 duplex;
+		u8 autoneg;
+		u8 media;

Nivedita Singhvi | 1 May 05:49 2004

Re: [PATCH 2.6.5] Re: Fw: Stack sends SYN/ACKs even though accept queue is full]

> Attaching a patch which adds a sysctl to turn off this
> behaviour.   Could you test this, please?  Patch against
> 2.6.5 vanilla kernel.  If you need a 2.4 version, let me
> know. 

Oops, Andrew pointed out that my mailer had done
the nasty and wrapped my patch. Resending..

thanks,
Nivedita

diff -urN linux-2.6.5/include/linux/sysctl.h linux-2.6.5synq/include/linux/sysctl.h
--- linux-2.6.5/include/linux/sysctl.h	2004-04-03 19:37:23.000000000 -0800
+++ linux-2.6.5synq/include/linux/sysctl.h	2004-04-30 15:24:12.000000000 -0700
@@ -323,6 +323,7 @@
 	NET_IPV4_IPFRAG_SECRET_INTERVAL=94,
 	NET_TCP_WESTWOOD=95,
 	NET_IPV4_IGMP_MAX_MSF=96,
+	NET_TCP_PRELOAD_SYNQ=97,
 };

 enum {
diff -urN linux-2.6.5/include/net/tcp.h linux-2.6.5synq/include/net/tcp.h
--- linux-2.6.5/include/net/tcp.h	2004-04-03 19:36:18.000000000 -0800
+++ linux-2.6.5synq/include/net/tcp.h	2004-04-30 13:55:18.000000000 -0700
@@ -583,6 +583,7 @@
 extern int sysctl_tcp_frto;
 extern int sysctl_tcp_low_latency;

