Lukas Razik | 1 Mar 19:55 2008

Re: TIPC over InfiniBand - is that possible?

Hi!

 > It's not possible to put TIPC directly on InfiniBand (IB) *now*
 > because no TIPC release supports IB bearers.

O.K.

 > Someone would have to write the IB bearer code for TIPC.
 > It's open source, you're free to do that yourself or
 > look for someone else to do the work.

I was only interested because, although I haven't had a closer look at the 
sources of the IB interface, I've seen (in inetdevice.h) that struct 
in_device is also based on the net_device struct, which is the base for 
the Ethernet devices...

 > also...
 > If you have (or create) an ethernet over IB driver, then
 > you could bind the existing TIPC stack to that.

You must be a fortune teller. ;-)
I've developed an Ethernet over Sockets driver which should also work 
with IB cards (I haven't tested it). It's stable, but the performance has 
to improve before it can be used further, so I have to redesign some 
things...

 > What's your intended use of TIPC/IB - just curious.

I work at the Chair for Operating Systems at the University of Aachen 
(Germany) in a team which tries to run (for example) the Kerrighed SSI 
cluster project over other than the Ethernet network cards.

Randy MacLeod | 2 Mar 00:29 2008

Re: TIPC over InfiniBand - is that possible?

Hi,

On Sat, Mar 1, 2008 at 1:55 PM, Lukas Razik <linux <at> razik.name> wrote:
>   > What's your intended use of TIPC/IB - just curious.
>
>  I work at the Chair for Operating Systems at the University of Aachen
>  (Germany) in a team which tries to run (for example) the Kerrighed SSI
>  cluster project over other than the Ethernet network cards.

Cool! I've been thinking for years that TIPC could be helpful
for a distributed SSI cluster.

--

-- 
// Randy

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
Lukas Razik | 2 Mar 11:43 2008

Re: TIPC over InfiniBand - is that possible?

Hello Randy!

>> I work at the Chair for Operating Systems at the University of Aachen
>> (Germany) in a team which tries to run (for example) the Kerrighed SSI
>> cluster project over other than the Ethernet network cards.
> 
> 
> Cool! I've been thinking for years that TIPC could be helpful
> for a distributed SSI cluster.

Yes, it's really cool because at first the Kerrighed people had their 
own communication layer based on the Ethernet interfaces. But then they 
decided to use an RPC API which runs over TIPC ( 
http://www.kerrighed.org/wiki/index.php/RPCAPI ), and I think that is a 
very good approach, if only for the compatibility with newer kernels.

_Maybe_ if we don't get good performance results with the Ethernet over 
"other bearers" driver, we will try to implement another network media 
type for TIPC, but that's a decision for my team leader...

Best regards,
Lukas

Matthias Kaehlcke | 2 Mar 19:36 2008

[PATCH] TIPC Protocol: Convert tsock->sem in a mutex

TIPC Protocol: The semaphore tsock->sem is used as mutex, convert it
to the mutex API

Signed-off-by: Matthias Kaehlcke <matthias <at> kaehlcke.net>

--

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 2290903..9ae8e9f 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -43,7 +43,7 @@
 #include <linux/slab.h>
 #include <linux/poll.h>
 #include <linux/fcntl.h>
-#include <asm/semaphore.h>
+#include <linux/mutex.h>
 #include <asm/string.h>
 #include <asm/atomic.h>
 #include <net/sock.h>
@@ -63,7 +63,7 @@
 struct tipc_sock {
 	struct sock sk;
 	struct tipc_port *p;
-	struct semaphore sem;
+	struct mutex lock;
 };

 #define tipc_sk(sk) ((struct tipc_sock*)sk)
@@ -217,7 +217,7 @@ static int tipc_create(struct net *net, struct socket *sock, int protocol)

Xpl++ | 2 Mar 22:18 2008

Link related question/issue

Hi everybody,

  I am sometimes experiencing RX packet drops, and it seems that TIPC has 
a hard time dealing with that. When I looked in tipc_link.c I found 
the following comment:

/* TODO: Implement stronger sequence # checking someday ... */

and .. then I got a bit worried :)
What exactly is the expected behavior in case of dropped eth packets?
As soon as the NIC drops even a few packets, the TIPC stack becomes .. 
well, completely unpredictable. In 80% of the cases a disable/enable of 
the bearer helps, but in the other 20% only a complete reboot 
solves the problem. Also, affected links never time out or reset on their 
own. I am working on resolving the packet drop issue in general, but 
that's not a long-term solution, and neither is trying to detect whether 
things went wrong. It's also worth noting that those 20% of cases tend to 
happen when my apps try to transfer 2-3MB or more in a series of 16K 
send()s over a SOCK_SEQPACKET socket.
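
The traffic pattern described above, a large buffer pushed out as a series
of 16K send() calls on a SOCK_SEQPACKET socket, can be sketched roughly as
follows. This is only an illustration and not the application code from
this report: the helper name send_in_chunks is hypothetical, the socket is
assumed to be already connected and blocking, and nothing in the loop is
TIPC-specific (it works on any connected SOCK_SEQPACKET descriptor).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: send `len` bytes from `buf` over the connected,
 * blocking socket `fd` as a series of records of at most `chunk` bytes.
 * Returns the number of records sent, or -1 on error (errno is set).
 */
static long send_in_chunks(int fd, const char *buf, size_t len, size_t chunk)
{
    long records = 0;

    assert(chunk > 0);
    while (len > 0) {
        size_t n = len < chunk ? len : chunk;
        ssize_t rc = send(fd, buf, n, 0);

        if (rc < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry this record */
            return -1;
        }
        /* SOCK_SEQPACKET keeps record boundaries: one send() is one record. */
        buf += rc;
        len -= (size_t)rc;
        records++;
    }
    return records;
}
```

With chunk set to 16384, a 3MB buffer becomes 192 records, each of which
the link layer then has to fragment to fit the 1500-byte MTU.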

About the setup:
* TIPC network of 20+ nodes, Intel/AMD, 32bit, SMP all of them
* NICs are either e1000 or tg3 driven, working over a gbit full duplex 
link, all nodes are connected to a 24 port gbit d-link switch
* TIPC is 1.7.5 with two of the 4 posted/available patches:
  - Prevent-premature-discarding-of-messages-during-fragment-reassembly
  - Fix-port-related-bugs-arising-when-TIPC-network-address-is-assigned
* issue exists with both 2.6.20.4 and 2.6.24.2 kernels
* tested with NIC qlen of 1000 - 8192, tipc link window up to 4096 packets
* under normal conditions max send queue stays within the 60-100 range.

Florian Westphal | 2 Mar 23:16 2008

Re: [PATCH] TIPC Protocol: Convert tsock->sem in a mutex

Matthias Kaehlcke <matthias <at> kaehlcke.net> wrote:
> TIPC Protocol: The semaphore tsock->sem is used as mutex, convert it
> to the mutex API

The locking mechanism was re-done in TIPC 1.7.X, and
AFAIK Allan Stephens is working on submitting
the current TIPC version for inclusion in 2.6.26.
So it probably doesn't make much sense to merge this patch
(even though it looks correct).

Sorry.

Xpl++ | 3 Mar 15:45 2008

Re: Link related question/issue

Hi Jon,

Read below ..

Jon Paul Maloy wrote:
> This scenario can easily be confirmed by doing a
> wireshark dump when a link goes stale. If you even
> could catch the moment when it happens, i.e. see the
> packet that is never delivered, we would have a good
> clue to follow.
>   
> Wireshark... 
>   
Well, I managed to do a wireshark dump and force the link to .. fail or 
whatever :)
The sequence of events looks like this:

A > B: fragmenter: first
A > B: fragmenter: fragment * 10
A > B: fragmenter: last

... this repeats several times with 2 * (B > A: link state: state) 
almost always present somewhere within that 12 pack sequence ... and 
then ...

A > B: fragmenter: first
A > B: fragmenter: fragment * 8 [note it probably should have been 10, 
as 95% of fragmented packets in my cluster are 16K sent over an MTU of 1500]
B > A: payld: norej: directmsg [this is an unrelated communication by 
other threads]
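
A back-of-the-envelope check of the packet counts in this trace: a 16K
message sent over an MTU of 1500 should need 12 link-level packets, which
matches the "first, fragment * 10, last" pattern. The sketch below only
does that arithmetic. The function name is hypothetical, and the header
sizes used in the example (24 bytes for the message header, 40 bytes per
fragment) are assumptions made for illustration, not values taken from
this thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of link-level packets needed to carry one message.
 * msg_len:  user payload in bytes
 * msg_hdr:  header of the original message (assumed value)
 * frag_hdr: header prepended to every fragment (assumed value)
 * mtu:      largest packet the bearer can carry
 */
static size_t fragment_count(size_t msg_len, size_t msg_hdr,
                             size_t frag_hdr, size_t mtu)
{
    size_t total, per_fragment;

    assert(mtu > frag_hdr);
    total = msg_len + msg_hdr;
    if (total <= mtu)
        return 1;                       /* fits in one packet, no fragmentation */
    per_fragment = mtu - frag_hdr;
    return (total + per_fragment - 1) / per_fragment;   /* round up */
}
```

Under those assumptions fragment_count(16384, 24, 40, 1500) gives 12, so a
sequence with only 8 middle fragments is two packets short of a complete
message.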

Jon Maloy | 3 Mar 17:17 2008

Re: Link related question/issue

It would be better if you attach the real Wireshark dump.
I need to see sequence numbers, gaps etc.

///jon

Xpl++ wrote:
> Hi Jon,
>
> Read below ..
>
> Jon Paul Maloy wrote:
>   
>> This scenario can easily be confirmed by doing a
>> wireshark dump when a link goes stale. If you even
>> could catch the moment when it happens, i.e. see the
>> packet that is never delivered, we would have a good
>> clue to follow.
>>   
>> Wireshark... 
>>   
>>     
> Well, I managed to do a wireshark dump and force the link to .. fail or 
> whatever :)
> The sequence of events looks like this:
>
> A > B: fragmenter: first
> A > B: fragmenter: fragment * 10
> A > B: fragmenter: last
>
> ... this repeats several times with 2 * (B > A: link state: state) 

Jon Maloy | 4 Mar 01:57 2008

Re: Link related question/issue

Hi,
I looked briefly at your dump with my new Wireshark.
It seems like things start to go wrong at packet
14191, where we suddenly lose 95 packets, and jump
from seq no 53888 to 53983.
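
The size of such a jump can be computed directly from the two sequence
numbers. The sketch below assumes 16-bit link-level sequence numbers that
wrap around, so the subtraction is done modulo 65536; the function names
are hypothetical. Note that the difference between 53888 and 53983 is 95,
while the number of packets strictly between them is 94 if both endpoints
were actually captured.

```c
#include <assert.h>
#include <stdint.h>

/* Forward distance from `from` to `to`, modulo 2^16 (assumed width). */
static uint16_t seq_distance(uint16_t from, uint16_t to)
{
    return (uint16_t)(to - from);
}

/* Packets missing strictly between the last and the next packet seen. */
static uint16_t seq_missing_between(uint16_t last_seen, uint16_t next_seen)
{
    uint16_t d = seq_distance(last_seen, next_seen);

    return d > 0 ? (uint16_t)(d - 1) : 0;
}

/* Self-check using the numbers from the dump discussed above. */
static void seq_demo(void)
{
    assert(seq_distance(53888, 53983) == 95);
    assert(seq_missing_between(53888, 53983) == 94);
    assert(seq_distance(65530, 4) == 10);   /* wrap-around case */
}
```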

Strangely enough, node 1.1.12 continues to ack packets
which we don't see in wireshark (is it possible that
wireshark can miss packets?). It goes on acking packets
up to the one with sequence number 53967 (one of the
"invisible" packets), but from there on it stops.

1.1.6 continues to send traffic for a while longer,
up to packet seq no 54991 (packet 14773), but then
it stops there too.
After this point, there are only State messages going
from 1.1.6 to 1.1.12, while traffic runs normally
in the opposite direction.

There never seems to be a request for retransmission 
(sequence gap is always 0) in the State messages sent
out from 1.1.12 to 1.1.6, as there should be. This may
mean that TIPC never receives any of the packets we see,
from 53983 onward, and hence never has a chance to detect
a gap.
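
The reasoning here is that a receiver can only report a gap once a packet
from beyond the gap actually reaches it. The toy model below illustrates
that point and nothing more. It is not TIPC's link code: the struct, the
function names, and the 16-bit sequence width are assumptions, and real
link protocols also queue out-of-order packets, which this sketch omits.

```c
#include <assert.h>
#include <stdint.h>

/* Toy receiver state (hypothetical, for illustration only). */
struct toy_receiver {
    uint16_t next_expected;   /* next in-sequence packet we are waiting for */
    uint16_t reported_gap;    /* gap we would advertise in a state message */
};

/* Process one arriving packet with sequence number `seq`. */
static void toy_receive(struct toy_receiver *rx, uint16_t seq)
{
    uint16_t ahead = (uint16_t)(seq - rx->next_expected);

    if (ahead == 0) {
        /* In sequence: accept it and clear any advertised gap. */
        rx->next_expected = (uint16_t)(rx->next_expected + 1);
        rx->reported_gap = 0;
    } else if (ahead < 0x8000) {
        /* From the future: `ahead` packets are missing in front of it. */
        rx->reported_gap = ahead;
    }
    /* Otherwise it is an old duplicate, ignored in this toy model. */
}

/* Scenario from the dump: 53888 was the last packet delivered. */
static void toy_demo(void)
{
    struct toy_receiver rx = { 53889, 0 };

    /* Nothing arrives at all: the advertised gap stays 0. */
    assert(rx.reported_gap == 0);

    /* Only if 53983 is really delivered does a gap show up. */
    toy_receive(&rx, 53983);
    assert(rx.reported_gap == 94);
}
```

In this model, a gap that stays at 0 while the sender is known to have
moved on means the later packets never reached the receiving code.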

Only a bearer reset can resolve this situation,
which seems to be your case.

As a sum of this, I start to suspect your Ethernet 

David Miller | 4 Mar 08:37 2008

Re: [PATCH] TIPC Protocol: Convert tsock->sem in a mutex

From: Matthias Kaehlcke <matthias <at> kaehlcke.net>
Date: Sun, 2 Mar 2008 19:36:37 +0100

> TIPC Protocol: The semaphore tsock->sem is used as mutex, convert it
> to the mutex API
> 
> Signed-off-by: Matthias Kaehlcke <matthias <at> kaehlcke.net>

Applied, thanks a lot.

Yes, I know the TIPC folks said they have a bunch of stuff coming.

But too bad, the TIPC folks haven't been able to submit patches
properly lately and I'm not going to penalize someone like Matthias
who can submit a clean, correct, and proper patch.
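
For readers unfamiliar with the pattern behind this patch: a semaphore
that is only ever used for mutual exclusion (initialised to 1, taken and
released around a critical section by the same task) can be replaced by a
mutex, which states the intent directly. The kernel calls the patch uses
are not visible here because the diff is cut off in this archive, so the
sketch below shows the same idea in user space with POSIX threads. The
struct and function names are hypothetical stand-ins; in the kernel the
corresponding primitives are down()/up() for semaphores and
mutex_lock()/mutex_unlock() for mutexes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 8
#define DEMO_ITERATIONS  100000

/* Hypothetical user-space stand-in for a socket with a per-socket lock. */
struct demo_sock {
    pthread_mutex_t lock;   /* plays the role of `struct mutex lock` */
    long counter;           /* shared state protected by the lock */
};

/* Each worker enters the critical section DEMO_ITERATIONS times. */
static void *demo_worker(void *arg)
{
    struct demo_sock *s = arg;
    int i;

    for (i = 0; i < DEMO_ITERATIONS; i++) {
        pthread_mutex_lock(&s->lock);
        s->counter++;
        pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Run `nthreads` workers on one demo_sock and return the final counter. */
static long demo_run(int nthreads)
{
    struct demo_sock s;
    pthread_t tid[DEMO_MAX_THREADS];
    int i, rc;

    assert(nthreads > 0 && nthreads <= DEMO_MAX_THREADS);
    s.counter = 0;
    rc = pthread_mutex_init(&s.lock, NULL);
    assert(rc == 0);
    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tid[i], NULL, demo_worker, &s);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&s.lock);
    return s.counter;
}
```

Because every increment happens under the lock, demo_run(4) returns
exactly 4 * DEMO_ITERATIONS.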
