Stephens, Allan | 2 Nov 15:19 2009

TIPC WG: Minutes of October 28, 2009 Meeting

A phone meeting of the TIPC working group was held on October 28, 2009.

Attendees:

Al Stephens (Wind River) -- meeting chair; Jon Maloy (Ericsson); Andrew
Booth (Performance Technologies)

Minutes:

1. Meeting opening

The meeting was called to order at 2:06 PM (EDT).

2. TIPC bug board review

There was no significant progress reported on any of the following
issues:

- Name table inconsistency issue
- Ethernet FIFO full issue
- Locking dependency issue
- Link misordering issue
- Broadcast link issues

In addition, a new issue involving problems with link session numbers
has been recently reported by Laser.  Fortunately, he also appears to
have discovered the cause and created a patch, which Jon has reviewed.
It is unclear if this issue & patch are related to the link issues
reported by Osamu and Surya, but it is a possibility.


mail gd | 12 Nov 04:40 2009

TIPC - Link Reset query

Hi,

I am new to TIPC and have written a couple of linux applications to
communicate over TIPC. These two continue to run well on two different nodes
but sometimes a random message is observed:

Sep 21 15:02:34 (none) kernel: [   21.296359] TIPC: Established link
<1.1.5:bond0-1.1.3:bond0> on network plane A
Sep 21 15:02:34 (none) kernel: [   21.673037] TIPC: Resetting link
<1.1.5:bond0-1.1.3:bond0>, requested by peer
Sep 21 15:02:34 (none) kernel: [   21.678581] TIPC: Lost link
<1.1.5:bond0-1.1.3:bond0> on network plane A
Sep 21 15:02:34 (none) kernel: [   21.685231] TIPC: Lost contact with
<1.1.3>
Sep 21 15:02:34 (none) kernel: [   21.695191] TIPC: Established link
<1.1.5:bond0-1.1.3:bond0> on network plane A
Sep 21 15:02:39 (none) kernel: [   25.604125] Resetting :timed out status:84
ctl:d8

As per the logs above, TIPC re-establishes the link after losing contact
with the other node. However, the applications which were running earlier
seem to experience issues, as the sender application finds send() failing
with errno 107 (ENOTCONN).

I just want to know:

1. The possible cause of link reset.
2. Why do the applications experience problems even when the link gets
re-established in the TIPC layer?


Stephens, Allan | 12 Nov 16:30 2009

Re: TIPC - Link Reset query

GD wrote:
> I am new to TIPC and have written a couple of linux 
> applications to communicate over TIPC. These two continue to 
> run well on two different nodes but sometimes a random 
> message is observed:
> 
> Sep 21 15:02:34 (none) kernel: [   21.296359] TIPC: Established link
> <1.1.5:bond0-1.1.3:bond0> on network plane A
> Sep 21 15:02:34 (none) kernel: [   21.673037] TIPC: Resetting link
> <1.1.5:bond0-1.1.3:bond0>, requested by peer
> Sep 21 15:02:34 (none) kernel: [   21.678581] TIPC: Lost link
> <1.1.5:bond0-1.1.3:bond0> on network plane A
> Sep 21 15:02:34 (none) kernel: [   21.685231] TIPC: Lost contact with
> <1.1.3>
> Sep 21 15:02:34 (none) kernel: [   21.695191] TIPC: Established link
> <1.1.5:bond0-1.1.3:bond0> on network plane A
> Sep 21 15:02:39 (none) kernel: [   25.604125] Resetting 
> :timed out status:84
> ctl:d8
> 
> As per the logs above, TIPC re-establishes the link after 
> losing contact with the other node. However, the applications 
> which were running earlier seem to experience issues, as the 
> sender application finds send() failing with errno 107 (ENOTCONN).
> 
> I just want to know:
> 
> 1. The possible cause of link reset.

The second message indicates that node <1.1.3> told node <1.1.5> to
(Continue reading)

Stephens, Allan | 12 Nov 17:52 2009

Re: TIPC - Link Reset query

Hi GD:

From a quick look at the TIPC 1.7 code base, it looks like reset
messages are sent when a link is first established and
thereafter only if one end of the link loses contact with the other end
(i.e. it stops receiving traffic and link state messages sent by the
other end).  You might want to try increasing the link tolerance of your
links to see if this helps keep things alive longer.

Please be advised that people have reported issues with the use of
bonded interfaces in the past, so you might want to re-examine the
motivation for that requirement.  As I understand it, bonded interfaces
are typically used to provide redundancy, so the failure of a single
Ethernet interface doesn't cause communication between nodes to be lost.
You can achieve this kind of redundancy using TIPC (without having to
utilize bonded interfaces) by configuring bearers over a pair of
Ethernet interfaces on each node and letting TIPC take care of
establishing links over each interface and switching traffic from one
link to the other as needed.

Regards,
Al

________________________________

	From: mail gd [mailto:mailgd83 <at> gmail.com] 
	Sent: Thursday, November 12, 2009 11:11 AM
	To: Stephens, Allan
	Subject: Re: [tipc-discussion] TIPC - Link Reset query
	

Randy MacLeod | 13 Nov 05:43 2009

strace & tipc

I've started to patch strace to understand tipc. Has anyone done this yet?

So this mess:

./strace /tmp/tipcTS
...
socket(0x1e /* PF_??? */, SOCK_RDM, 0)  = 3
bind(3, {sa_family=0x1e /* AF_??? */,
sa_data="\2\1H\0\0\0\350\3\0\0\0\0\0\0"}, 16) = 0

becomes:
./strace /tmp/tipcTS
...
socket(PF_TIPC, SOCK_RDM, 0)            = 3
bind(3, {sa_family=AF_TIPC, addrtype=NAME, scope=ZONE}, 16) = 0

Note that I haven't even finished unpacking the address properly.

Initial patch attached applies to strace.sf.net git tree with commit
46ed50d56909843420b0a0cb1360a500ce421d52
at the head.
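
The remaining bytes in the raw trace above can be unpacked by hand. The
following is only a sketch, not part of the attached patch: it assumes the
trace was captured on a little-endian host and that the 14 sa_data bytes
follow the struct sockaddr_tipc layout (addrtype, scope, then name type,
instance and domain as 32-bit values). The struct and function names are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded view of the 14 sa_data bytes that follow sa_family. */
struct tipc_name_view {
    uint8_t  addrtype;  /* 2 is TIPC_ADDR_NAME in <linux/tipc.h> */
    uint8_t  scope;     /* 1 is zone scope */
    uint32_t type;      /* name type */
    uint32_t instance;  /* name instance */
    uint32_t domain;    /* lookup domain */
};

/* Read a 32-bit little-endian value (assumption: trace is from such a host). */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if fewer than 14 bytes were supplied. */
int decode_tipc_sa_data(const unsigned char *d, size_t len,
                        struct tipc_name_view *out)
{
    assert(d != NULL && out != NULL);
    if (len < 14)
        return -1;
    out->addrtype = d[0];
    out->scope    = d[1];
    out->type     = le32(d + 2);
    out->instance = le32(d + 6);
    out->domain   = le32(d + 10);
    return 0;
}
```

Fed the bytes "\2\1H\0\0\0\350\3\0\0\0\0\0\0" from the trace, this gives
addrtype 2 (NAME), scope 1 (ZONE), name type 72, instance 1000 and domain 0.
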
--

-- 
../Randy/..
------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july

Holger brunck | 23 Nov 10:00 2009

TIPC Buffersize for connectionless messages

Hi all,
for node communication we use TIPC with connectionless messages between TIPC
ports. If we execute sendto() we cannot be sure that the message was successfully
sent. On the other hand, we do not want to use connection-oriented sockets on a
single CPU, because we think that the overhead (connect, send, disconnect) for a
single message is too big.

Is there a way to get information about the contents of the TIPC buffer (global
and per TIPC port) from userspace via an ioctl call or something like that?

Best regards
Holger Brunck

Stephens, Allan | 23 Nov 15:08 2009

Re: TIPC Buffersize for connectionless messages

Hi Holger:

If you call sendto() and it returns a non-negative value, then you know
that the message you were attempting to send has been sent (i.e. TIPC
was able to create the message and dispatch it towards the specified
destination).  All your application needs to do is to check the return
value of the sendto() call.
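
A minimal sketch of that check, assuming a SOCK_RDM socket and a destination
given as a TIPC name. The helper names tipc_name_addr() and send_checked()
are invented for this example, and the name type and instance are whatever
the application has chosen.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Fill in a destination address that refers to a TIPC name. */
void tipc_name_addr(struct sockaddr_tipc *sa, unsigned type, unsigned instance)
{
    assert(sa != NULL);
    memset(sa, 0, sizeof(*sa));
    sa->family = AF_TIPC;
    sa->addrtype = TIPC_ADDR_NAME;
    sa->addr.name.name.type = type;
    sa->addr.name.name.instance = instance;
    sa->addr.name.domain = 0;   /* 0: no restriction on the lookup domain */
}

/*
 * Send one message and report the outcome.
 * Returns 0 if sendto() accepted the message, otherwise a negative errno.
 * Success only means the message was dispatched, not that it was received.
 */
int send_checked(int fd, const void *buf, size_t len,
                 const struct sockaddr *to, socklen_t tolen)
{
    ssize_t n = sendto(fd, buf, len, 0, to, tolen);

    if (n < 0)
        return -errno;
    return 0;
}
```

A caller would create the socket with socket(AF_TIPC, SOCK_RDM, 0), build the
address with tipc_name_addr(), and treat any non-zero result from
send_checked() as a failed send.
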

If you want to be sure that the message was successfully *received*,
that's a different question.  If you use SOCK_RDM-type sockets, TIPC
will attempt to return a message to the sender if it cannot be added to
the destination's socket receive queue, or if the destination closes its
socket before receiving the message.  See section 1.5.6 of the TIPC 1.7
Programmer's Guide
(http://tipc.sourceforge.net/doc/tipc_1.7_prog_guide.html) for more info
on message rejection and message return.

Regards,
Al 

> -----Original Message-----
> From: Holger brunck [mailto:holger.brunck <at> keymile.com] 
> Sent: Monday, November 23, 2009 4:01 AM
> To: tipc-discussion <at> lists.sourceforge.net
> Subject: [tipc-discussion] TIPC Buffersize for connectionless messages
> 
> Hi all,
> for node communication we use TIPC with connectionless 
> messages between tipc ports. If we execute sendto we can not 
> be sure that the message is successfully sent. On the other 
(Continue reading)

Brunck, Holger | 25 Nov 10:18 2009

Re: TIPC Buffersize for connectionless messages

Hi Allan, 
you are right. If sendto() returns a positive value, the message was
successfully created. But the destination socket can drop the message
when its queue is full. That's OK, and there is no other way for
node-to-node communication.

But if I want to send messages between TIPC ports on one node, there
could be other ways. During the sendto() call, all the information
about the state of the receiving socket is available, so the return
value of sendto() could indicate that the message has reached its
destination receive queue.

We have a system with approximately 100 TIPC ports. These ports are fully
meshed, so every port is able to send to every other port in the system.
If we switched to connection-oriented communication, we think the
overhead of connect, accept, and disconnect would be too big for every
message. Additionally, we don't want to make our communication
synchronous.

Maybe there is another way to get information about the receive queue
if the destination is on the same node?

Best regards 
Holger Brunck 

-----Original Message-----
From: Stephens, Allan [mailto:allan.stephens <at> windriver.com] 
Sent: Monday, November 23, 2009 3:08 PM
To: Brunck, Holger; tipc-discussion <at> lists.sourceforge.net
Subject: RE: [tipc-discussion] TIPC Buffersize for connectionless
(Continue reading)

Stephens, Allan | 25 Nov 15:17 2009

Re: TIPC Buffersize for connectionless messages

Hi Holger:

One of the main principles underlying TIPC is the fact that messaging is
"transparent" with respect to the location of ports within the network.
This means that we want the TIPC API to work the same way for
communication between ports on the same node as it does for
communication between ports on different nodes.  Consequently, because
sendto() can't indicate that a message has been successfully placed in
the destination socket's receive queue in the latter case, we don't want
it to provide this indication in the former case.

If you really want to know whether a message you have sent was
successfully added to the receiver's queue, I think the best you can do
is to use SOCK_RDM sockets and do the following:

1) have the sending part of your application call sendto() to send each
message
2) have the receiving part of your application call recv() [or
recvfrom() or recvmsg()] to handle incoming messages; if you get a
return code of 0 for a receive, this means the associated message was
rejected by the receiver rather than being a normal message sent by
another port.
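
One way to find out why a message came back is to receive with recvmsg() and
look at the ancillary data. The sketch below assumes that TIPC attaches a
TIPC_ERRINFO object (two 32-bit values: the error code and the length of the
returned data) at level SOL_TIPC to a returned message, as defined in
<linux/tipc.h>. The function name is made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/*
 * Scan the ancillary data that recvmsg() filled in for a TIPC_ERRINFO object.
 * Returns the TIPC error code (for example TIPC_ERR_OVERLOAD) if one is
 * attached, or TIPC_OK if the message carries no error information.
 */
int tipc_return_reason(struct msghdr *msg)
{
    struct cmsghdr *c;

    assert(msg != NULL);
    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        uint32_t info[2];   /* [0] error code, [1] returned data length */

        if (c->cmsg_level != SOL_TIPC || c->cmsg_type != TIPC_ERRINFO)
            continue;
        if (c->cmsg_len < CMSG_LEN(sizeof(info)))
            continue;       /* too short to hold both values, skip it */
        memcpy(info, CMSG_DATA(c), sizeof(info));
        return (int)info[0];
    }
    return TIPC_OK;
}
```

The caller has to supply a control buffer through msg_control and
msg_controllen before calling recvmsg(), otherwise there is nothing for the
scan to find.
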

Depending on how your application is designed, you may find it handy to
use the MSG_DONTWAIT and MSG_PEEK flags on the receive side of things to
allow you to distinguish between normal incoming messages and returned
messages.

Alternatively, you might want to use a pair of ports (instead of just
one) for each communication endpoint, and use one port for sending and
(Continue reading)

Stephens, Allan | 26 Nov 14:44 2009

REMINDER - TIPC Working Group Meeting on Thursday [TODAY]

The next meeting of the TIPC Working Group is scheduled for Thursday,
November 26, 2009 at 1:30 PM EST.

AGENDA:
- TIPC bug board review of open issues

Additional topics for discussion may also be raised at the meeting
itself. It would be appreciated if any background material about such
topics was distributed prior to the meeting so it can be considered in
advance.

CONFERENCE CALL INFO:

Dial-in number: 1-866-869-3090 (North America, toll-free)
                outside North America, contact Al Stephens

Conference code: 6132702259 then #

Regards,
Al Stephens
TIPC WG chair
