Amerigo Wang | 5 Apr 11:12 2010

[v2 Patch 1/3] netpoll: add generic support for bridge and bonding devices

V2:
Fix some bugs in the previous version.
Remove ->netpoll_setup and ->netpoll_xmit; they are not necessary.
Don't poll all underlying devices; poll ->real_dev in struct netpoll.
Thanks to David for suggesting the above.

--------->

This whole patchset is for adding netpoll support to bridge and bonding
devices. I already tested it for bridge, bonding, bridge over bonding,
and bonding over bridge. It looks fine now.

Please comment.

To make bridge and bonding support netpoll, we need to adjust
some netpoll generic code. This patch does the following things:

1) introduce two new priv_flags for struct net_device:
   IFF_IN_NETPOLL, which indicates that we are processing a netpoll;
   IFF_DISABLE_NETPOLL, which is used to disable netpoll support for a
   device at run-time;

2) introduce one new method for netdev_ops:
   ->ndo_netpoll_cleanup() is used to clean up netpoll when a device is
     removed.

3) introduce netpoll_poll_dev() which takes a struct net_device * parameter;
   export netpoll_send_skb() and netpoll_poll_dev() which will be used later;

4) hide a pointer to struct netpoll in struct netpoll_info, ditto.
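
As a rough userland illustration of how the two flags from item 1 might be used (a sketch only, not the patch itself): the bit values, the sk_ names and the counters below are hypothetical stand-ins, and the real definitions live in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; the real flags are defined by the kernel. */
#define SK_IFF_IN_NETPOLL      0x1000u
#define SK_IFF_DISABLE_NETPOLL 0x2000u

/* Minimal stand-in for struct net_device. */
struct sk_net_device {
    unsigned int priv_flags;
    int netpoll_sent;   /* frames sent via the netpoll path */
    int normal_sent;    /* frames sent via the normal queue */
};

/* Send on a lower device, choosing the path from the upper device's flag. */
static void sk_forward(struct sk_net_device *upper, struct sk_net_device *lower)
{
    assert(upper && lower);
    if (upper->priv_flags & SK_IFF_IN_NETPOLL) {
        /* Mark the lower device only for the duration of the send. */
        lower->priv_flags |= SK_IFF_IN_NETPOLL;
        lower->netpoll_sent++;
        lower->priv_flags &= ~SK_IFF_IN_NETPOLL;
    } else {
        lower->normal_sent++;
    }
}

/* A device that has opted out at run-time must not be used for netpoll. */
static bool sk_netpoll_allowed(const struct sk_net_device *dev)
{
    return !(dev->priv_flags & SK_IFF_DISABLE_NETPOLL);
}
```

The point of the model is only that IFF_IN_NETPOLL is a transient marker around one transmit, while IFF_DISABLE_NETPOLL is a longer-lived state of the device.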

Amerigo Wang | 5 Apr 11:12 2010

[v2 Patch 3/3] bonding: make bonding support netpoll


Based on Andy's work, but I modified a lot.

Similar to the patch for bridge, this patch does:

1) implement the 2 methods to support netpoll for bonding;

2) modify netpoll when forwarding packets via bonding;

3) disable netpoll support of bonding when a device without netpoll
   support is added to bonding;

4) enable netpoll support when all underlying devices support netpoll.
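
Items 3 and 4 amount to an "all slaves must support netpoll" rule. A minimal userland sketch of that rule, assuming each slave can be reduced to two booleans (the sk_ types and names are hypothetical, and no locking is modeled):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of a slave device to what the rule looks at. */
struct sk_slave {
    bool has_poll_controller;  /* stands in for ndo_poll_controller != NULL */
    bool netpoll_disabled;     /* stands in for IFF_DISABLE_NETPOLL */
};

/* True only if there is at least one slave and every slave can do netpoll. */
static bool sk_slaves_support_netpoll(const struct sk_slave *slaves, size_t n)
{
    assert(slaves != NULL || n == 0);
    if (n == 0)
        return false;
    for (size_t i = 0; i < n; i++) {
        if (slaves[i].netpoll_disabled || !slaves[i].has_poll_controller)
            return false;
    }
    return true;
}

/* After adding a slave, the master's disabled state follows the rule. */
static void sk_after_enslave(bool *master_disabled,
                             const struct sk_slave *slaves, size_t n)
{
    *master_disabled = !sk_slaves_support_netpoll(slaves, n);
}
```

One slave without a poll controller is enough to disable netpoll on the master, and the master is re-enabled once every remaining slave supports it.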

Cc: Andy Gospodarek <gospo <at> redhat.com>
Cc: Jeff Moyer <jmoyer <at> redhat.com>
Cc: Matt Mackall <mpm <at> selenic.com>
Cc: Neil Horman <nhorman <at> tuxdriver.com>
Cc: Jay Vosburgh <fubar <at> us.ibm.com>
Cc: David Miller <davem <at> davemloft.net>
Signed-off-by: WANG Cong <amwang <at> redhat.com>

---

Index: linux-2.6/drivers/net/bonding/bond_main.c
===================================================================
--- linux-2.6.orig/drivers/net/bonding/bond_main.c
+++ linux-2.6/drivers/net/bonding/bond_main.c
@@ -59,6 +59,7 @@
 #include <linux/uaccess.h>

Amerigo Wang | 5 Apr 11:12 2010

[v2 Patch 2/3] bridge: make bridge support netpoll


Based on the previous patch, make bridge support netpoll by:

1) implement the 2 methods to support netpoll for bridge;

2) modify netpoll when forwarding packets via bridge;

3) disable netpoll support of bridge when a device without netpoll
   support is added to bridge;

4) enable netpoll support when all underlying devices support netpoll.

Cc: David Miller <davem <at> davemloft.net>
Cc: Neil Horman <nhorman <at> tuxdriver.com>
Cc: Stephen Hemminger <shemminger <at> linux-foundation.org>
Cc: Matt Mackall <mpm <at> selenic.com>
Signed-off-by: WANG Cong <amwang <at> redhat.com>

---

Index: linux-2.6/net/bridge/br_device.c
===================================================================
--- linux-2.6.orig/net/bridge/br_device.c
+++ linux-2.6/net/bridge/br_device.c
@@ -13,8 +13,10 @@

 #include <linux/kernel.h>
 #include <linux/netdevice.h>
+#include <linux/netpoll.h>
 #include <linux/etherdevice.h>

Kalyan Chakravadhanula | 1 Apr 21:32 2010

installation instructions

Hi,
 
I downloaded the bridge-utils-1.4. Could you please point me as to how to install this utility?
 
Thanks,
Kalyan
_______________________________________________
Bridge mailing list
Bridge <at> lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/bridge
Andy Gospodarek | 5 Apr 21:43 2010

Re: [v2 Patch 3/3] bonding: make bonding support netpoll

On Mon, Apr 05, 2010 at 05:12:40AM -0400, Amerigo Wang wrote:
> 
> Based on Andy's work, but I modified a lot.
> 
> Similar to the patch for bridge, this patch does:
> 
> 1) implement the 2 methods to support netpoll for bonding;
> 
> 2) modify netpoll when forwarding packets via bonding;
> 
> 3) disable netpoll support of bonding when a device without netpoll
>    support is added to bonding;
> 
> 4) enable netpoll support when all underlying devices support netpoll.
> 
> Cc: Andy Gospodarek <gospo <at> redhat.com>
> Cc: Jeff Moyer <jmoyer <at> redhat.com>
> Cc: Matt Mackall <mpm <at> selenic.com>
> Cc: Neil Horman <nhorman <at> tuxdriver.com>
> Cc: Jay Vosburgh <fubar <at> us.ibm.com>
> Cc: David Miller <davem <at> davemloft.net>
> Signed-off-by: WANG Cong <amwang <at> redhat.com>
> 

I tried these patches on top of Linus' latest tree and still get
deadlocks.  Your line numbers might differ a bit, but you should be
seeing them too.

# echo 7 4 1 7 > /proc/sys/kernel/printk 
# ifup bond0 
bonding: bond0: setting mode to balance-rr (0).                                                 
bonding: bond0: Setting MII monitoring interval to 1000.                                        
ADDRCONF(NETDEV_UP): bond0: link is not ready                                                   
bonding: bond0: Adding slave eth4.                                                              
bnx2 0000:10:00.0: eth4: using MSIX                                                             
bonding: bond0: enslaving eth4 as an active interface with a down link.                         
bonding: bond0: Adding slave eth5.                                                              
bnx2 0000:10:00.1: eth5: using MSIX                                                             
bonding: bond0: enslaving eth5 as an active interface with a down link.                         
bnx2 0000:10:00.0: eth4: NIC Copper Link is Up, 100 Mbps full duplex,
receive & transmit flow control ON
bonding: bond0: link status definitely up for interface eth4.                                   
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready                                              
bnx2 0000:10:00.1: eth5: NIC Copper Link is Up, 100 Mbps full duplex,
receive & transmit flow control ON
bond0: IPv6 duplicate address fe80::210:18ff:fe36:ad4 detected!                                 
bonding: bond0: link status definitely up for interface eth5.                                   
# cat /proc/net/bonding/bond0                                                    
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)                                    

Bonding Mode: load balancing (round-robin)                                                      
MII Status: up                                                                                  
MII Polling Interval (ms): 1000                                                                 
Up Delay (ms): 0                                                                                
Down Delay (ms): 0                                                                              

Slave Interface: eth4                                                                           
MII Status: up                                                                                  
Link Failure Count: 0
Permanent HW addr: 00:10:18:36:0a:d4

Slave Interface: eth5
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:10:18:36:0a:d6
# modprobe netconsole 
netconsole: local port 1234
netconsole: local IP 10.0.100.2
netconsole: interface 'bond0'
netconsole: remote port 6666
netconsole: remote IP 10.0.100.1
netconsole: remote ethernet address 00:e0:81:71:ee:aa
console [netcon0] enabled
netconsole: network logging started
# echo -eth4 > /sys/class/net/bond0/bonding/slaves  
bonding: bond0: Removing slave eth4

[ now the system is hung ]

My suspicion from dealing with this problem in the past is that there is
contention over bond->lock.

Since there are statements that will result in netconsole messages inside
the write_lock_bh in bond_release:

1882         write_lock_bh(&bond->lock);
1883 
1884         slave = bond_get_slave_by_dev(bond, slave_dev);
1885         if (!slave) {
1886                 /* not a slave of this bond */
1887                 pr_info("%s: %s not enslaved\n",
1888                         bond_dev->name, slave_dev->name);
1889                 write_unlock_bh(&bond->lock);
1890                 return -EINVAL;
1891         }
1892 
1893         if (!bond->params.fail_over_mac) {
1894                 if (!compare_ether_addr(bond_dev->dev_addr, slave->perm_hwaddr) &&
1895                     bond->slave_cnt > 1)
1896                         pr_warning("%s: Warning: the permanent HWaddr of %s - %pM - is still in use by %s.

we are getting stuck at 1896 since bond_xmit_roundrobin (in my case)
will try to acquire bond->lock for reading.

One valuable aspect of the netpoll_start_xmit routine was that it could be
used to check whether bond->lock could be taken for writing.  This
made us sure that we were not in a call stack that had already taken the
lock, and queuing the skb to be sent later would prevent the imminent
deadlock.

A way to prevent this is needed, and a first pass might be to do
something similar to the change below in all the xmit routines.  I
confirmed that the following patch prevents that deadlock:

# git diff drivers/net/bonding/
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 4a41886..53b39cc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4232,7 +4232,8 @@ static int bond_xmit_roundrobin(struct sk_buff *skb, struc
        int i, slave_no, res = 1;
        struct iphdr *iph = ip_hdr(skb);

-       read_lock(&bond->lock);
+       if (!read_trylock(&bond->lock))
+               return NETDEV_TX_BUSY;

        if (!BOND_IS_OK(bond))
                goto out;
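
To make the idea concrete, here is a single-threaded userland model of the situation (a sketch only; the sk_ lock is a toy with hypothetical names, not the kernel rwlock). The release path logs while it holds the lock for writing, the log re-enters the transmit path, and the transmit path backs off instead of spinning forever:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy reader/writer lock: models ownership only, never blocks. */
struct sk_rwlock {
    bool writer;   /* true while held for writing */
    int readers;   /* number of read holders */
};

static bool sk_write_trylock(struct sk_rwlock *l)
{
    if (l->writer || l->readers > 0)
        return false;
    l->writer = true;
    return true;
}

static void sk_write_unlock(struct sk_rwlock *l)
{
    assert(l->writer);
    l->writer = false;
}

static bool sk_read_trylock(struct sk_rwlock *l)
{
    if (l->writer)
        return false;
    l->readers++;
    return true;
}

static void sk_read_unlock(struct sk_rwlock *l)
{
    assert(l->readers > 0);
    l->readers--;
}

enum sk_tx { SK_TX_OK, SK_TX_BUSY };

/* Transmit path: report "busy" when a writer already holds the lock. */
static enum sk_tx sk_xmit(struct sk_rwlock *l, int *sent)
{
    if (!sk_read_trylock(l))
        return SK_TX_BUSY;
    (*sent)++;
    sk_read_unlock(l);
    return SK_TX_OK;
}

/* Release path: logs under the write lock; the log re-enters sk_xmit(). */
static enum sk_tx sk_release_with_log(struct sk_rwlock *l, int *sent)
{
    enum sk_tx rc;

    if (!sk_write_trylock(l))
        return SK_TX_BUSY;
    rc = sk_xmit(l, sent);  /* models pr_info() -> netconsole -> bond xmit */
    sk_write_unlock(l);
    return rc;
}
```

With a blocking read lock in sk_xmit(), the nested call would wait on a lock its own call stack holds; with the trylock it returns busy, so the message is not sent from that context.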

The kernel no longer hangs, but a new warning message shows up (over
netconsole even!):

------------[ cut here ]------------
WARNING: at kernel/softirq.c:143 local_bh_enable+0x43/0xba()
Hardware name: HP xw4400 Workstation
Modules linked in: tg3 netconsole bonding ipt_REJECT bridge stp autofs4
i2c_dev i2c_core hidp rfcomm l2cap crc16 bluetooth rfkill sunrpc 8021q
iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter
ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath
video output sbs sbshc battery acpi_memhotplug ac lp sg ide_cd_mod
tpm_tis rtc_cmos rtc_core serio_raw cdrom libphy e1000e floppy
parport_pc parport button tpm tpm_bios bnx2 rtc_lib tulip pcspkr shpchp
dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod ata_piix ahci
libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last
unloaded: tg3]
Pid: 9, comm: events/0 Not tainted 2.6.34-rc3 #6
Call Trace:
 [<ffffffff81058754>] ? cpu_clock+0x2d/0x41
 [<ffffffff810404d9>] ? local_bh_enable+0x43/0xba
 [<ffffffff8103a350>] warn_slowpath_common+0x77/0x8f
 [<ffffffff812a4659>] ? dev_queue_xmit+0x408/0x467
 [<ffffffff8103a377>] warn_slowpath_null+0xf/0x11
 [<ffffffff810404d9>] local_bh_enable+0x43/0xba
 [<ffffffff812a4659>] dev_queue_xmit+0x408/0x467
 [<ffffffff812a435e>] ? dev_queue_xmit+0x10d/0x467
 [<ffffffffa04a3868>] bond_dev_queue_xmit+0x1cd/0x1f9 [bonding]
 [<ffffffffa04a4217>] bond_start_xmit+0x139/0x3e9 [bonding]
 [<ffffffff812b0e9a>] queue_process+0xa8/0x160
 [<ffffffff812b0df2>] ? queue_process+0x0/0x160
 [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
 [<ffffffff813362bc>] ? restore_args+0x0/0x30
 [<ffffffff81053884>] ? kthread+0x0/0x85

This points to possible locking issues (probably in netpoll_send_skb) that
I would suggest you investigate further.  It may explain why we cannot
perform an:

# rmmod bonding

without the system deadlocking (even with my patch above).

> ---
> 
> Index: linux-2.6/drivers/net/bonding/bond_main.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/bonding/bond_main.c
> +++ linux-2.6/drivers/net/bonding/bond_main.c
> @@ -59,6 +59,7 @@
>  #include <linux/uaccess.h>
>  #include <linux/errno.h>
>  #include <linux/netdevice.h>
> +#include <linux/netpoll.h>
>  #include <linux/inetdevice.h>
>  #include <linux/igmp.h>
>  #include <linux/etherdevice.h>
> @@ -430,7 +431,18 @@ int bond_dev_queue_xmit(struct bonding *
>  	}
>  
>  	skb->priority = 1;
> -	dev_queue_xmit(skb);
> +#ifdef CONFIG_NET_POLL_CONTROLLER
> +	if (bond->dev->priv_flags & IFF_IN_NETPOLL) {
> +		struct netpoll *np = bond->dev->npinfo->netpoll;
> +		slave_dev->npinfo = bond->dev->npinfo;
> +		np->real_dev = np->dev = skb->dev;
> +		slave_dev->priv_flags |= IFF_IN_NETPOLL;
> +		netpoll_send_skb(np, skb);
> +		slave_dev->priv_flags &= ~IFF_IN_NETPOLL;
> +		np->dev = bond->dev;
> +	} else
> +#endif
> +		dev_queue_xmit(skb);
>  
>  	return 0;
>  }
> @@ -1329,6 +1341,60 @@ static void bond_detach_slave(struct bon
>  	bond->slave_cnt--;
>  }
>  
> +#ifdef CONFIG_NET_POLL_CONTROLLER
> +static bool slaves_support_netpoll(struct net_device *bond_dev)
> +{
> +	struct bonding *bond = netdev_priv(bond_dev);
> +	struct slave *slave;
> +	int i = 0;
> +	bool ret = true;
> +
> +	read_lock(&bond->lock);
> +	bond_for_each_slave(bond, slave, i) {
> +		if ((slave->dev->priv_flags & IFF_DISABLE_NETPOLL)
> +				|| !slave->dev->netdev_ops->ndo_poll_controller)
> +			ret = false;
> +	}
> +	read_unlock(&bond->lock);
> +	return i != 0 && ret;
> +}
> +
> +static void bond_poll_controller(struct net_device *bond_dev)
> +{
> +	struct net_device *dev = bond_dev->npinfo->netpoll->real_dev;
> +	if (dev != bond_dev)
> +		netpoll_poll_dev(dev);
> +}
> +
> +static void bond_netpoll_cleanup(struct net_device *bond_dev)
> +{
> +	struct bonding *bond = netdev_priv(bond_dev);
> +	struct slave *slave;
> +	const struct net_device_ops *ops;
> +	int i;
> +
> +	read_lock(&bond->lock);
> +	bond_dev->npinfo = NULL;
> +	bond_for_each_slave(bond, slave, i) {
> +		if (slave->dev) {
> +			ops = slave->dev->netdev_ops;
> +			if (ops->ndo_netpoll_cleanup)
> +				ops->ndo_netpoll_cleanup(slave->dev);
> +			else
> +				slave->dev->npinfo = NULL;
> +		}
> +	}
> +	read_unlock(&bond->lock);
> +}
> +
> +#else
> +
> +static void bond_netpoll_cleanup(struct net_device *bond_dev)
> +{
> +}
> +
> +#endif
> +
>  /*---------------------------------- IOCTL ----------------------------------*/
>  
>  static int bond_sethwaddr(struct net_device *bond_dev,
> @@ -1746,6 +1812,18 @@ int bond_enslave(struct net_device *bond
>  		new_slave->state == BOND_STATE_ACTIVE ? "n active" : " backup",
>  		new_slave->link != BOND_LINK_DOWN ? "n up" : " down");
>  
> +#ifdef CONFIG_NET_POLL_CONTROLLER
> +	if (slaves_support_netpoll(bond_dev)) {
> +		bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
> +		if (bond_dev->npinfo)
> +			slave_dev->npinfo = bond_dev->npinfo;
> +	} else if (!(bond_dev->priv_flags & IFF_DISABLE_NETPOLL)) {
> +		bond_dev->priv_flags |= IFF_DISABLE_NETPOLL;
> +		pr_info("New slave device %s does not support netpoll\n",
> +			slave_dev->name);
> +		pr_info("Disabling netpoll support for %s\n", bond_dev->name);
> +	}
> +#endif
>  	/* enslave is successful */
>  	return 0;
>  
> @@ -1929,6 +2007,15 @@ int bond_release(struct net_device *bond
>  
>  	netdev_set_master(slave_dev, NULL);
>  
> +#ifdef CONFIG_NET_POLL_CONTROLLER
> +	if (slaves_support_netpoll(bond_dev))
> +		bond_dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
> +	if (slave_dev->netdev_ops->ndo_netpoll_cleanup)
> +		slave_dev->netdev_ops->ndo_netpoll_cleanup(slave_dev);
> +	else
> +		slave_dev->npinfo = NULL;
> +#endif
> +
>  	/* close slave before restoring its mac address */
>  	dev_close(slave_dev);
>  
> @@ -4448,6 +4535,10 @@ static const struct net_device_ops bond_
>  	.ndo_vlan_rx_register	= bond_vlan_rx_register,
>  	.ndo_vlan_rx_add_vid 	= bond_vlan_rx_add_vid,
>  	.ndo_vlan_rx_kill_vid	= bond_vlan_rx_kill_vid,
> +#ifdef CONFIG_NET_POLL_CONTROLLER
> +	.ndo_netpoll_cleanup	= bond_netpoll_cleanup,
> +	.ndo_poll_controller	= bond_poll_controller,
> +#endif
>  };
>  
>  static void bond_setup(struct net_device *bond_dev)
> @@ -4533,6 +4624,8 @@ static void bond_uninit(struct net_devic
>  {
>  	struct bonding *bond = netdev_priv(bond_dev);
>  
> +	bond_netpoll_cleanup(bond_dev);
> +
>  	/* Release the bonded slaves */
>  	bond_release_all(bond_dev);
>  
Cong Wang | 6 Apr 04:43 2010

Re: [v2 Patch 3/3] bonding: make bonding support netpoll

Andy Gospodarek wrote:
> 
> I tried these patches on top of Linus' latest tree and still get
> deadlocks.  Your line numbers might differ a bit, but you should be
> seeing them too.
> 

Yeah, my local clone is some days behind Linus' latest tree. :)

> # echo 7 4 1 7 > /proc/sys/kernel/printk 
> # ifup bond0 
> bonding: bond0: setting mode to balance-rr (0).                                                 
> bonding: bond0: Setting MII monitoring interval to 1000.                                        
> ADDRCONF(NETDEV_UP): bond0: link is not ready                                                   
> bonding: bond0: Adding slave eth4.                                                              
> bnx2 0000:10:00.0: eth4: using MSIX                                                             
> bonding: bond0: enslaving eth4 as an active interface with a down link.                         
> bonding: bond0: Adding slave eth5.                                                              
> bnx2 0000:10:00.1: eth5: using MSIX                                                             
> bonding: bond0: enslaving eth5 as an active interface with a down link.                         
> bnx2 0000:10:00.0: eth4: NIC Copper Link is Up, 100 Mbps full duplex,
> receive & transmit flow control ON
> bonding: bond0: link status definitely up for interface eth4.                                   
> ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready                                              
> bnx2 0000:10:00.1: eth5: NIC Copper Link is Up, 100 Mbps full duplex,
> receive & transmit flow control ON
> bond0: IPv6 duplicate address fe80::210:18ff:fe36:ad4 detected!                                 
> bonding: bond0: link status definitely up for interface eth5.                                   
> # cat /proc/net/bonding/bond0                                                    
> Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)                                    
>                                                                                                 
> Bonding Mode: load balancing (round-robin)                                                      
> MII Status: up                                                                                  
> MII Polling Interval (ms): 1000                                                                 
> Up Delay (ms): 0                                                                                
> Down Delay (ms): 0                                                                              
>                                                                                                 
> Slave Interface: eth4                                                                           
> MII Status: up                                                                                  
> Link Failure Count: 0
> Permanent HW addr: 00:10:18:36:0a:d4
> 
> Slave Interface: eth5
> MII Status: up
> Link Failure Count: 0
> Permanent HW addr: 00:10:18:36:0a:d6
> # modprobe netconsole 
> netconsole: local port 1234
> netconsole: local IP 10.0.100.2
> netconsole: interface 'bond0'
> netconsole: remote port 6666
> netconsole: remote IP 10.0.100.1
> netconsole: remote ethernet address 00:e0:81:71:ee:aa
> console [netcon0] enabled
> netconsole: network logging started
> # echo -eth4 > /sys/class/net/bond0/bonding/slaves  
> bonding: bond0: Removing slave eth4
> 
> [ now the system is hung ]
> 
> My suspicion from dealing with this problem in the past is that there is
> contention over bond->lock.
> 
> Since there are statements that will result in netconsole messages inside
> the write_lock_bh in bond_release:
> 
> 1882         write_lock_bh(&bond->lock);
> 1883 
> 1884         slave = bond_get_slave_by_dev(bond, slave_dev);
> 1885         if (!slave) {
> 1886                 /* not a slave of this bond */
> 1887                 pr_info("%s: %s not enslaved\n",
> 1888                         bond_dev->name, slave_dev->name);
> 1889                 write_unlock_bh(&bond->lock);
> 1890                 return -EINVAL;
> 1891         }
> 1892 
> 1893         if (!bond->params.fail_over_mac) {
> 1894                 if (!compare_ether_addr(bond_dev->dev_addr, slave->perm_hwaddr) &&
> 1895                     bond->slave_cnt > 1)
> 1896                         pr_warning("%s: Warning: the permanent HWaddr of %s - %pM - is still in use by %s.
> 
> we are getting stuck at 1896 since bond_xmit_roundrobin (in my case)
> will try to acquire bond->lock for reading.
> 
> One valuable aspect of the netpoll_start_xmit routine was that it could be
> used to check whether bond->lock could be taken for writing.  This
> made us sure that we were not in a call stack that had already taken the
> lock, and queuing the skb to be sent later would prevent the imminent
> deadlock.
> 
> A way to prevent this is needed, and a first pass might be to do
> something similar to the change below in all the xmit routines.  I
> confirmed that the following patch prevents that deadlock:
> 
> # git diff drivers/net/bonding/
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 4a41886..53b39cc 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -4232,7 +4232,8 @@ static int bond_xmit_roundrobin(struct sk_buff *skb, struc
>         int i, slave_no, res = 1;
>         struct iphdr *iph = ip_hdr(skb);
>  
> -       read_lock(&bond->lock);
> +       if (!read_trylock(&bond->lock))
> +               return NETDEV_TX_BUSY;
>  
>         if (!BOND_IS_OK(bond))
>                 goto out;
> 
> The kernel no longer hangs, but a new warning message shows up (over
> netconsole even!):
> 
> ------------[ cut here ]------------
> WARNING: at kernel/softirq.c:143 local_bh_enable+0x43/0xba()
> Hardware name: HP xw4400 Workstation
> Modules linked in: tg3 netconsole bonding ipt_REJECT bridge stp autofs4
> i2c_dev i2c_core hidp rfcomm l2cap crc16 bluetooth rfkill sunrpc 8021q
> iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter
> ip6_tables x_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath
> video output sbs sbshc battery acpi_memhotplug ac lp sg ide_cd_mod
> tpm_tis rtc_cmos rtc_core serio_raw cdrom libphy e1000e floppy
> parport_pc parport button tpm tpm_bios bnx2 rtc_lib tulip pcspkr shpchp
> dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod ata_piix ahci
> libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last
> unloaded: tg3]
> Pid: 9, comm: events/0 Not tainted 2.6.34-rc3 #6
> Call Trace:
>  [<ffffffff81058754>] ? cpu_clock+0x2d/0x41
>  [<ffffffff810404d9>] ? local_bh_enable+0x43/0xba
>  [<ffffffff8103a350>] warn_slowpath_common+0x77/0x8f
>  [<ffffffff812a4659>] ? dev_queue_xmit+0x408/0x467
>  [<ffffffff8103a377>] warn_slowpath_null+0xf/0x11
>  [<ffffffff810404d9>] local_bh_enable+0x43/0xba
>  [<ffffffff812a4659>] dev_queue_xmit+0x408/0x467
>  [<ffffffff812a435e>] ? dev_queue_xmit+0x10d/0x467
>  [<ffffffffa04a3868>] bond_dev_queue_xmit+0x1cd/0x1f9 [bonding]
>  [<ffffffffa04a4217>] bond_start_xmit+0x139/0x3e9 [bonding]
>  [<ffffffff812b0e9a>] queue_process+0xa8/0x160
>  [<ffffffff812b0df2>] ? queue_process+0x0/0x160
>  [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
>  [<ffffffff813362bc>] ? restore_args+0x0/0x30
>  [<ffffffff81053884>] ? kthread+0x0/0x85
> 
> This points to possible locking issues (probably in netpoll_send_skb) that
> I would suggest you investigate further.  It may explain why we cannot
> perform an:
> 
> # rmmod bonding
> 
> without the system deadlocking (even with my patch above).
> 

Thanks a lot for testing!

Before I try to reproduce it, could you please try replacing the 'read_lock()'
in slaves_support_netpoll() with 'read_lock_bh()' (and read_unlock() too)? See whether this helps.

After I reproduce this, I will try it too.
Cong Wang | 6 Apr 06:38 2010

Re: [v2 Patch 3/3] bonding: make bonding support netpoll

Cong Wang wrote:
> Before I try to reproduce it, could you please try to replace the 
> 'read_lock()'
> in slaves_support_netpoll() with 'read_lock_bh()'? (read_unlock() too) 
> Try if this helps.
> 

Confirmed. Please use the attached patch instead, for your testing.

Thanks!

Andy Gospodarek | 6 Apr 16:48 2010

Re: [v2 Patch 3/3] bonding: make bonding support netpoll

On Tue, Apr 06, 2010 at 12:38:16PM +0800, Cong Wang wrote:
> Cong Wang wrote:
>> Before I try to reproduce it, could you please try to replace the  
>> 'read_lock()'
>> in slaves_support_netpoll() with 'read_lock_bh()'? (read_unlock() too)  
>> Try if this helps.
>>
>
> Confirmed. Please use the attached patch instead, for your testing.
>
> Thanks!
>

Moving those locks to bh-locks will not resolve this.  I tried that
yesterday and tried your new patch today without success.  That warning
is a WARN_ON_ONCE so you need to reboot to see that it is still a
problem.  Simply unloading and loading the new module is not an accurate
test.
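
For anyone unfamiliar with why a reboot is needed: a warn-once check keeps a per-call-site flag that is never reset, and here the call site is in the core kernel (kernel/softirq.c in the trace), not in the bonding module. A small userland model of that behaviour, with hypothetical sk_ names and no claim to match the kernel macro's implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how many warnings were actually emitted (for illustration). */
static int sk_warnings_printed;

/* Warn the first time cond is true at this call site, then stay silent. */
#define SK_WARN_ON_ONCE(cond)                                   \
    do {                                                        \
        static bool sk_warned;                                  \
        if ((cond) && !sk_warned) {                             \
            sk_warned = true;                                   \
            sk_warnings_printed++;                              \
            fprintf(stderr, "WARNING at %s:%d\n",               \
                    __FILE__, __LINE__);                        \
        }                                                       \
    } while (0)

/* Stand-in for a core function that is called from a bad context. */
static void sk_local_bh_enable(bool bad_context)
{
    SK_WARN_ON_ONCE(bad_context);
}
```

Calling sk_local_bh_enable(true) repeatedly emits the warning only once for the lifetime of the program, which is why a quiet log after reloading just the module says nothing about whether the problem is fixed.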

Also, my system still hangs when removing the bonding module.  I do not
think you intended to fix this with the patch, but wanted it to be clear
to everyone on the list.

You should also configure your kernel with some of the lock debugging
options enabled.  I've been using the following:

CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_LOCK_STAT=y
CONFIG_DEBUG_LOCKDEP=y

Here is the output when I remove a slave from the bond.  My
xmit_roundrobin patch from earlier (replacing read_lock with
read_trylock) was applied.  It might be helpful for you when debugging
these issues.

------------[ cut here ]------------
WARNING: at kernel/softirq.c:143 local_bh_enable+0x43/0xba()
Hardware name: HP xw4400 Workstation
Modules linked in: netconsole bonding ipt_REJECT bridge stp autofs4 i2c_dev i2c_core hidp rfcomm
l2cap crc16 bluetooth rfki]
Pid: 10, comm: events/1 Not tainted 2.6.34-rc3 #6
Call Trace:
 [<ffffffff81058754>] ? cpu_clock+0x2d/0x41
 [<ffffffff810404d9>] ? local_bh_enable+0x43/0xba
 [<ffffffff8103a350>] warn_slowpath_common+0x77/0x8f
 [<ffffffff812a4659>] ? dev_queue_xmit+0x408/0x467
 [<ffffffff8103a377>] warn_slowpath_null+0xf/0x11
 [<ffffffff810404d9>] local_bh_enable+0x43/0xba
 [<ffffffff812a4659>] dev_queue_xmit+0x408/0x467
 [<ffffffff812a435e>] ? dev_queue_xmit+0x10d/0x467
 [<ffffffffa04a383f>] bond_dev_queue_xmit+0x1cd/0x1f9 [bonding]
 [<ffffffffa04a41ee>] bond_start_xmit+0x139/0x3e9 [bonding]
 [<ffffffff812b0e9a>] queue_process+0xa8/0x160
 [<ffffffff812b0df2>] ? queue_process+0x0/0x160
 [<ffffffff81050bfb>] worker_thread+0x1af/0x2ae
 [<ffffffff81050ba2>] ? worker_thread+0x156/0x2ae
 [<ffffffff81053c34>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff81050a4c>] ? worker_thread+0x0/0x2ae
 [<ffffffff81053901>] kthread+0x7d/0x85
 [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
 [<ffffffff813362bc>] ? restore_args+0x0/0x30
 [<ffffffff81053884>] ? kthread+0x0/0x85
 [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
---[ end trace 241f49bf65e0f4f0 ]---

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34-rc3 #6
---------------------------------------------------------
events/1/10 just changed the state of lock:
 (&bonding_netdev_xmit_lock_key){+.+...}, at: [<ffffffff812b0e75>] queue_process+0x83/0x160
but this lock was taken by another, SOFTIRQ-safe lock in the past:
 (&(&dev->tx_global_lock)->rlock){+.-...}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
4 locks held by events/1/10:
 #0:  (events){+.+.+.}, at: [<ffffffff81050ba2>] worker_thread+0x156/0x2ae
 #1:  ((&(&npinfo->tx_work)->work)){+.+...}, at: [<ffffffff81050ba2>] worker_thread+0x156/0x2ae
 #2:  (&bonding_netdev_xmit_lock_key){+.+...}, at: [<ffffffff812b0e75>] queue_process+0x83/0x160
 #3:  (&bond->lock){++.+..}, at: [<ffffffffa04a4107>] bond_start_xmit+0x52/0x3e9 [bonding]

the shortest dependencies between 2nd lock and 1st lock:
 -> (&(&dev->tx_global_lock)->rlock){+.-...} ops: 129 {
    HARDIRQ-ON-W at:
                          [<ffffffff810651ef>] __lock_acquire+0x643/0x813
                          [<ffffffff81065487>] lock_acquire+0xc8/0xed
                          [<ffffffff81335742>] _raw_spin_lock+0x31/0x66
                          [<ffffffff812b64bd>] dev_deactivate+0x6f/0x195
                          [<ffffffff812ad7c4>] linkwatch_do_dev+0x9a/0xae
                          [<ffffffff812ada6a>] __linkwatch_run_queue+0x106/0x14a
                          [<ffffffff812adad8>] linkwatch_event+0x2a/0x31
                          [<ffffffff81050bfb>] worker_thread+0x1af/0x2ae
                          [<ffffffff81053901>] kthread+0x7d/0x85
                          [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
    IN-SOFTIRQ-W at:
                          [<ffffffff810651a3>] __lock_acquire+0x5f7/0x813
                          [<ffffffff81065487>] lock_acquire+0xc8/0xed
                          [<ffffffff81335742>] _raw_spin_lock+0x31/0x66
                          [<ffffffff812b6606>] dev_watchdog+0x23/0x1f2
                          [<ffffffff8104701b>] run_timer_softirq+0x1d1/0x285
                          [<ffffffff81040021>] __do_softirq+0xdb/0x1ab
                          [<ffffffff8100388c>] call_softirq+0x1c/0x34
                          [<ffffffff81004f9d>] do_softirq+0x38/0x83
                          [<ffffffff8103ff44>] irq_exit+0x45/0x47
                          [<ffffffff810193bc>] smp_apic_timer_interrupt+0x88/0x98
                          [<ffffffff81003353>] apic_timer_interrupt+0x13/0x20
                          [<ffffffff81001a21>] cpu_idle+0x4d/0x6b
                          [<ffffffff8131da3a>] rest_init+0xbe/0xc2
                          [<ffffffff81a00d4e>] start_kernel+0x38c/0x399
                          [<ffffffff81a002a5>] x86_64_start_reservations+0xb5/0xb9
                          [<ffffffff81a0038f>] x86_64_start_kernel+0xe6/0xed
    INITIAL USE at:
                         [<ffffffff8106525c>] __lock_acquire+0x6b0/0x813
                         [<ffffffff81065487>] lock_acquire+0xc8/0xed
                         [<ffffffff81335742>] _raw_spin_lock+0x31/0x66
                         [<ffffffff812b64bd>] dev_deactivate+0x6f/0x195
                         [<ffffffff812ad7c4>] linkwatch_do_dev+0x9a/0xae
                         [<ffffffff812ada6a>] __linkwatch_run_queue+0x106/0x14a
                         [<ffffffff812adad8>] linkwatch_event+0x2a/0x31
                         [<ffffffff81050bfb>] worker_thread+0x1af/0x2ae
                         [<ffffffff81053901>] kthread+0x7d/0x85
                         [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
  }
  ... key      at: [<ffffffff8282ceb0>] __key.51521+0x0/0x8
  ... acquired at:
   [<ffffffff810649f9>] validate_chain+0xb87/0xd3a
   [<ffffffff81065359>] __lock_acquire+0x7ad/0x813
   [<ffffffff81065487>] lock_acquire+0xc8/0xed
   [<ffffffff81335742>] _raw_spin_lock+0x31/0x66
   [<ffffffff812b64e4>] dev_deactivate+0x96/0x195
   [<ffffffff812a17fc>] __dev_close+0x69/0x86
   [<ffffffff8129f8ed>] __dev_change_flags+0xa8/0x12b
   [<ffffffff812a148c>] dev_change_flags+0x1c/0x51
   [<ffffffff812eee8a>] devinet_ioctl+0x26e/0x5d0
   [<ffffffff812ef978>] inet_ioctl+0x8a/0xa2
   [<ffffffff8128fc28>] sock_do_ioctl+0x26/0x45
   [<ffffffff8128fe5a>] sock_ioctl+0x213/0x226
   [<ffffffff810e5988>] vfs_ioctl+0x2a/0x9d
   [<ffffffff810e5f13>] do_vfs_ioctl+0x491/0x4e2
   [<ffffffff810e5fbb>] sys_ioctl+0x57/0x7a
   [<ffffffff8100296b>] system_call_fastpath+0x16/0x1b

-> (&bonding_netdev_xmit_lock_key){+.+...} ops: 2 {
   HARDIRQ-ON-W at:
                        [<ffffffff810651ef>] __lock_acquire+0x643/0x813
                        [<ffffffff81065487>] lock_acquire+0xc8/0xed
                        [<ffffffff81335742>] _raw_spin_lock+0x31/0x66
                        [<ffffffff812b64e4>] dev_deactivate+0x96/0x195
                        [<ffffffff812a17fc>] __dev_close+0x69/0x86
                        [<ffffffff8129f8ed>] __dev_change_flags+0xa8/0x12b
                        [<ffffffff812a148c>] dev_change_flags+0x1c/0x51
                        [<ffffffff812eee8a>] devinet_ioctl+0x26e/0x5d0
                        [<ffffffff812ef978>] inet_ioctl+0x8a/0xa2
                        [<ffffffff8128fc28>] sock_do_ioctl+0x26/0x45
                        [<ffffffff8128fe5a>] sock_ioctl+0x213/0x226
                        [<ffffffff810e5988>] vfs_ioctl+0x2a/0x9d
                        [<ffffffff810e5f13>] do_vfs_ioctl+0x491/0x4e2
                        [<ffffffff810e5fbb>] sys_ioctl+0x57/0x7a
                        [<ffffffff8100296b>] system_call_fastpath+0x16/0x1b
   SOFTIRQ-ON-W at:
                        [<ffffffff81062006>] mark_held_locks+0x49/0x69
                        [<ffffffff81062139>] trace_hardirqs_on_caller+0x113/0x13e
                        [<ffffffff81062171>] trace_hardirqs_on+0xd/0xf
                        [<ffffffff81040548>] local_bh_enable+0xb2/0xba
                        [<ffffffff812a4659>] dev_queue_xmit+0x408/0x467
                        [<ffffffffa04a383f>] bond_dev_queue_xmit+0x1cd/0x1f9 [bonding]
                        [<ffffffffa04a41ee>] bond_start_xmit+0x139/0x3e9 [bonding]
                        [<ffffffff812b0e9a>] queue_process+0xa8/0x160
                        [<ffffffff81050bfb>] worker_thread+0x1af/0x2ae
                        [<ffffffff81053901>] kthread+0x7d/0x85
                        [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
   INITIAL USE at:
                       [<ffffffff8106525c>] __lock_acquire+0x6b0/0x813
                       [<ffffffff81065487>] lock_acquire+0xc8/0xed
                       [<ffffffff81335742>] _raw_spin_lock+0x31/0x66
                       [<ffffffff812b64e4>] dev_deactivate+0x96/0x195
                       [<ffffffff812a17fc>] __dev_close+0x69/0x86
                       [<ffffffff8129f8ed>] __dev_change_flags+0xa8/0x12b
                       [<ffffffff812a148c>] dev_change_flags+0x1c/0x51
                       [<ffffffff812eee8a>] devinet_ioctl+0x26e/0x5d0
                       [<ffffffff812ef978>] inet_ioctl+0x8a/0xa2
                       [<ffffffff8128fc28>] sock_do_ioctl+0x26/0x45
                       [<ffffffff8128fe5a>] sock_ioctl+0x213/0x226
                       [<ffffffff810e5988>] vfs_ioctl+0x2a/0x9d
                       [<ffffffff810e5f13>] do_vfs_ioctl+0x491/0x4e2
                       [<ffffffff810e5fbb>] sys_ioctl+0x57/0x7a
                       [<ffffffff8100296b>] system_call_fastpath+0x16/0x1b
 }
 ... key      at: [<ffffffffa04b1968>] bonding_netdev_xmit_lock_key+0x0/0xffffffffffffa78c [bonding]
 ... acquired at:
   [<ffffffff8106386d>] check_usage_backwards+0xb8/0xc7
   [<ffffffff81061d81>] mark_lock+0x311/0x54d
   [<ffffffff81062006>] mark_held_locks+0x49/0x69
   [<ffffffff81062139>] trace_hardirqs_on_caller+0x113/0x13e
   [<ffffffff81062171>] trace_hardirqs_on+0xd/0xf
   [<ffffffff81040548>] local_bh_enable+0xb2/0xba
   [<ffffffff812a4659>] dev_queue_xmit+0x408/0x467
   [<ffffffffa04a383f>] bond_dev_queue_xmit+0x1cd/0x1f9 [bonding]
   [<ffffffffa04a41ee>] bond_start_xmit+0x139/0x3e9 [bonding]
   [<ffffffff812b0e9a>] queue_process+0xa8/0x160
   [<ffffffff81050bfb>] worker_thread+0x1af/0x2ae
   [<ffffffff81053901>] kthread+0x7d/0x85
   [<ffffffff81003794>] kernel_thread_helper+0x4/0x10

stack backtrace:
Pid: 10, comm: events/1 Tainted: G        W  2.6.34-rc3 #6
Call Trace:
 [<ffffffff8106189e>] print_irq_inversion_bug+0x121/0x130
 [<ffffffff8106386d>] check_usage_backwards+0xb8/0xc7
 [<ffffffff810637b5>] ? check_usage_backwards+0x0/0xc7
 [<ffffffff81061d81>] mark_lock+0x311/0x54d
 [<ffffffff81062006>] mark_held_locks+0x49/0x69
 [<ffffffff81040548>] ? local_bh_enable+0xb2/0xba
 [<ffffffff81062139>] trace_hardirqs_on_caller+0x113/0x13e
 [<ffffffff812a4659>] ? dev_queue_xmit+0x408/0x467
 [<ffffffff81062171>] trace_hardirqs_on+0xd/0xf
 [<ffffffff81040548>] local_bh_enable+0xb2/0xba
 [<ffffffff812a4659>] dev_queue_xmit+0x408/0x467
 [<ffffffff812a435e>] ? dev_queue_xmit+0x10d/0x467
 [<ffffffffa04a383f>] bond_dev_queue_xmit+0x1cd/0x1f9 [bonding]
 [<ffffffffa04a41ee>] bond_start_xmit+0x139/0x3e9 [bonding]
 [<ffffffff812b0e9a>] queue_process+0xa8/0x160
 [<ffffffff812b0df2>] ? queue_process+0x0/0x160
 [<ffffffff81050bfb>] worker_thread+0x1af/0x2ae
 [<ffffffff81050ba2>] ? worker_thread+0x156/0x2ae
 [<ffffffff81053c34>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff81050a4c>] ? worker_thread+0x0/0x2ae
 [<ffffffff81053901>] kthread+0x7d/0x85
 [<ffffffff81003794>] kernel_thread_helper+0x4/0x10
 [<ffffffff813362bc>] ? restore_args+0x0/0x30
 [<ffffffff81053884>] ? kthread+0x0/0x85
 [<ffffffff81003790>] ? kernel_thread_helper+0x0/0x10
Dead loop on virtual device bond0, fix it urgently!
Ryan Whelan | 6 Apr 23:02 2010

Bridging vSwitches in VMwares ESXi

I'm having an issue bridging 2 virtual switches in VMware ESXi.  I've made a post on the VMware forums describing the issue (http://communities.vmware.com/message/1507261#1507261).

I have searched the internet and found a post (http://archives.free.net.ph/message/20100108.174704.efbb18cc.ja.html) by someone having the exact same issue; it looks like that post was on this list?

In short, I have a Linux VM in ESXi with 2 vNICs, one in each of 2 different vSwitches.  A client (in my case a Windows machine) on the second vSwitch can't get the MAC address of the default gateway on the first vSwitch.  Sniffing the traffic shows the ARP broadcast from the Windows machine making it over the Linux bridge and getting responded to by the Cisco gateway, but the response never makes it back over the bridge.  Watching the MAC table in the Linux bridge shows it mistakenly associates the MAC address of the Windows machine with the wrong port (eth0 in my case; eth1 is the vNIC plugged into the switch with the Windows box).
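
The MAC-table check described above can be scripted. The following is a minimal sketch, assuming the table is read with `brctl showmacs <bridge>` from bridge-utils and that its output has the usual four columns (port number, MAC address, local flag, ageing timer). The function names and the MAC addresses in SAMPLE are made up for illustration, and the port numbers are bridge port numbers, not interface names.

```python
def parse_showmacs(text):
    """Parse `brctl showmacs <bridge>` output into a list of dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a 4-column data row.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        port_no, mac, is_local, ageing = fields
        entries.append({
            "port": int(port_no),
            "mac": mac.lower(),
            "local": is_local == "yes",
            "ageing": float(ageing),
        })
    return entries


def port_for_mac(entries, mac):
    """Return the bridge port a MAC was learned on, or None if absent."""
    mac = mac.lower()
    for entry in entries:
        if entry["mac"] == mac:
            return entry["port"]
    return None


# Illustrative output only; these addresses are not from the thread.
SAMPLE = """\
port no mac addr                is local?       ageing timer
  1     00:0c:29:11:22:33       no                 0.52
  1     00:0c:29:aa:bb:cc       yes                0.00
  2     00:0c:29:dd:ee:ff       yes                0.00
"""

if __name__ == "__main__":
    table = parse_showmacs(SAMPLE)
    print(port_for_mac(table, "00:0C:29:11:22:33"))
```

Running this in a loop while the client sends ARP requests would show whether its MAC address moves between ports, which is the symptom reported here.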

I'm not sure where the issue is; it's really a pretty simple setup.  Am I missing something simple? Is there really a bug here?

Thanks!
_______________________________________________
Bridge mailing list
Bridge <at> lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/bridge
Robert LeBlanc | 7 Apr 01:17 2010

Re: Bridging vSwitches in VMwares ESXi

On Tue, Apr 6, 2010 at 3:02 PM, Ryan Whelan <ryan.whelan <at> tbamerica.com> wrote:
>
> I'm having an issue bridging 2 virtual switches in VMware ESXi.  I've made a post on the VMware forums
> describing the issue (http://communities.vmware.com/message/1507261#1507261).
> I have searched the internet and found a post
> (http://archives.free.net.ph/message/20100108.174704.efbb18cc.ja.html) by someone having the
> exact same issue; it looks like that post was on this list?
> In short, I have a Linux VM in ESXi with 2 vNICs, one in each of 2 different vSwitches.  A client (in my case a
> Windows machine) on the second vSwitch can't get the MAC address of the default gateway on the first
> vSwitch.  Sniffing the traffic shows the ARP broadcast from the Windows machine making it over the Linux
> bridge and getting responded to by the Cisco gateway, but the response never makes it back over the bridge.
> Watching the MAC table in the Linux bridge shows it mistakenly associates the MAC address of the
> Windows machine with the wrong port (eth0 in my case; eth1 is the vNIC plugged into the switch with the Windows box).
> I'm not sure where the issue is; it's really a pretty simple setup.  Am I missing something simple? Is there
> really a bug here?
> Thanks!

You do not want to bridge in a VMware environment; it will only drive
you to an early grave. I've blogged my experience with this problem at
http://robert.leblancnet.us/ (you will need a Google Wave account to
view it). In short, use proxy ARP instead if you can.
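
The exact proxy ARP setup is not described in this thread, so the following is only a sketch under stated assumptions: the Linux VM routes between eth0 and eth1 instead of bridging them, IPv4 forwarding is enabled, and proxy ARP is switched on per interface through the standard sysctl files under /proc/sys. The function names are hypothetical, and the routes needed to reach hosts on each side are not shown.

```python
import os


def proxy_arp_settings(interfaces):
    """Return {relative sysctl path: value} for routing with proxy ARP."""
    # Forwarding must be on, otherwise answered ARP requests lead nowhere.
    settings = {"net/ipv4/ip_forward": "1"}
    for iface in interfaces:
        settings[f"net/ipv4/conf/{iface}/proxy_arp"] = "1"
    return settings


def apply_settings(settings, root="/proc/sys"):
    """Write each value to its file under root (requires root privileges)."""
    for rel_path, value in settings.items():
        with open(os.path.join(root, rel_path), "w") as handle:
            handle.write(value + "\n")


if __name__ == "__main__":
    # Interface names taken from the original report; printed, not applied.
    for path, value in proxy_arp_settings(["eth0", "eth1"]).items():
        print(path, "=", value)
```

The `root` parameter exists so the writes can be tried against a scratch directory before touching /proc/sys on a live system.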

Robert LeBlanc
Life Sciences & Undergraduate Education Computer Support
Brigham Young University
