Ying Xue | 27 Aug 10:09 2015

[PATCH] tipc: avoid packets leaking on socket receive queue

Even if we drain receive queue thoroughly in tipc_release() after tipc
socket is removed from rhashtable, it is possible that some packets
are in flight because some CPU runs receiver and did rhashtable lookup
before we removed socket. They will achieve receive queue, but nobody
delete them at all. To avoid this leak, we register a private socket
destructor to purge receive queue, meaning releasing packets pending
on receive queue will be delayed until the last reference of tipc
socket will be released.

Signed-off-by: Ying Xue <ying.xue <at> windriver.com>
---
 net/tipc/socket.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 1060d52..70ab4b7 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
 <at>  <at>  -105,6 +105,7  <at>  <at>  struct tipc_sock {
 static int tipc_backlog_rcv(struct sock *sk, struct sk_buff *skb);
 static void tipc_data_ready(struct sock *sk);
 static void tipc_write_space(struct sock *sk);
+static void tipc_sock_destruct(struct sock *sk);
 static int tipc_release(struct socket *sock);
 static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags);
 static int tipc_wait_for_sndmsg(struct socket *sock, long *timeo_p);
 <at>  <at>  -381,6 +382,7  <at>  <at>  static int tipc_sk_create(struct net *net, struct socket *sock,
 	sk->sk_rcvbuf = sysctl_tipc_rmem[1];
 	sk->sk_data_ready = tipc_data_ready;
 	sk->sk_write_space = tipc_write_space;
(Continue reading)

Kolmakov Dmitriy | 27 Aug 08:57 2015

SOCK_RDM reliability

Hi All,

During my experience with TIPC in SOCK_RDM mode I was faced with message lost problem. After some digging I
figured out that the root cause of the issue is the following code in tipc/socket.c:
static int filter_rcv(struct sock *sk, struct sk_buff **skb)
{
...
/* Reject message if there isn't room to queue it */
if (sk_rmem_alloc_get(sk) + (*skb)->truesize >= limit)
return -TIPC_ERR_OVERLOAD;
...
}
It means that the received and composed message can be silently dropped on socket layer, which completely
brakes app-to-app reliability of data exchange.

I found some old discussions on this case:
http://sourceforge.net/p/tipc/mailman/message/18827243/
http://sourceforge.net/p/tipc/mailman/message/28695298/
http://article.gmane.org/gmane.network.tipc.general/2840
And it looks like this issue is well-known and persist in TIPC from very early times.
Does it mean that it is not treated as a bug? If it is planned to somehow fix this behavior?

Thanks in advance for your help,
Dmitry Kolmakov
------------------------------------------------------------------------------
Richard Alpe | 26 Aug 14:22 2015
Picon

Locking of socket receive queue

Hi Guys,

I'm looking at a NULL pointer Oops in a backport kernel. However the
same problem seems to be present in the latest kernel as well.

It happens when the user space owner is killed.

[17959.797516] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[2015-08-13 18:50:31,303] [17959.805345] IP: [<ffffffffa009741b>] tipc_release+0x26b/0x3f0 [tipc]
[2015-08-13 18:50:31,313] [17959.811714] PGD c2b973067 PUD bb08b4067 PMD
[2015-08-13 18:50:31,313] [17959.816158] Oops: 0002 [#1] SMP

Running this through GDB shows that the NULL deref is on "next->prev = prev"
below and if my GDB mojo is correct it originates from the
__skb_queue_purge(&sk->sk_receive_queue) call in tipc_release();

static inline void __skb_unlink(struct sk_buff *skb, struct sk_buff_head *list)
{
        struct sk_buff *next, *prev;

        list->qlen--;
        next       = skb->next;
        prev       = skb->prev;
        skb->next  = skb->prev = NULL;
        next->prev = prev;
        prev->next = next;
}

As the next structure is incomplete (NULL) this leads me to believe that
tipc_release() are operating on a queue at the same time someone else are
(Continue reading)

Jon Maloy | 19 Aug 08:00 2015
Picon

[PATCH net-next 0/3] tipc: fix link failover/synch problems

We fix a couple of problems with the new link failover/synch
implementation, which was introduced in this release cycle. They are
all related to situations where there is a very short interval between
the disabling and enabling of interfaces.

Jon Maloy (3):
  tipc: eliminate risk of premature link setup during failover
  tipc: interrupt link synchronization when a link goes down
  tipc: fix stale link problem during synchronization

 net/tipc/link.c |  5 +++--
 net/tipc/node.c | 27 +++++++++++++++++++++------
 2 files changed, 24 insertions(+), 8 deletions(-)

--

-- 
1.9.1

------------------------------------------------------------------------------
richard.alpe | 17 Aug 14:15 2015
Picon

[PATCH net-next] tipc: don't sanity check non-existing TLV (NL compat)

From: Richard Alpe <richard.alpe <at> ericsson.com>

A zero length payload means that no TLV (Type Length Value) data has
been passed. Prior to this patch a non-existing TLV could be sanity
checked with TLV_OK() resulting in random behavior where a user
sending an empty message occasionally got a incorrect "operation not
supported" message back.

Signed-off-by: Richard Alpe <richard.alpe <at> ericsson.com>
Reviewed-by: Erik Hugne <erik.hugne <at> ericsson.com>
---
 net/tipc/netlink_compat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 53e0fee..1eadc95 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
 <at>  <at>  -1114,7 +1114,7  <at>  <at>  static int tipc_nl_compat_recv(struct sk_buff *skb, struct genl_info *info)
 	}

 	len = nlmsg_attrlen(req_nlh, GENL_HDRLEN + TIPC_GENL_HDRLEN);
-	if (TLV_GET_LEN(msg.req) && !TLV_OK(msg.req, len)) {
+	if (len && !TLV_OK(msg.req, len)) {
 		msg.rep = tipc_get_err_tlv(TIPC_CFG_NOT_SUPPORTED);
 		err = -EOPNOTSUPP;
 		goto send;
--

-- 
2.1.4

(Continue reading)

Erik Hugne | 11 Aug 10:34 2015
Picon

Illegal FSM events on link changeover

Two nodes, 1.1.1 sending fragmented SOCK_STREAM traffic to 1.1.2.
Two bearers set up, on the receiving node i down/up the underlying interface
to trigger changeover.

Then this occurs:

node2 ~ # for i in $(seq 0 10000); do ip link set vlan0 down; ip link set vlan0 
up; sleep 3; done
[  259.957202] Resetting bearer <eth:vlan0>
[  262.970160] Resetting bearer <eth:vlan0>
[  263.224736] Illegal FSM event fa110ede in state e on link 1.1.2:eth0-1.1.1:eth0
[  265.982733] Resetting bearer <eth:vlan0>
[  266.233682] Illegal FSM event fa110ede in state e on link 1.1.2:eth0-1.1.1:eth0
[  268.996920] Resetting bearer <eth:vlan0>
[  269.619152] Illegal FSM event fa110ede in state e on link 1.1.2:eth0-1.1.1:eth0
[  272.010923] Resetting bearer <eth:vlan0>
[  272.626773] Illegal FSM event fa110ede in state e on link 1.1.2:eth0-1.1.1:eth0

//E

------------------------------------------------------------------------------
Erik Hugne | 10 Aug 14:04 2015
Picon

Bclink retransmission stall

We have a retransmission/sync issue with the broadcast link.
This only seems to occur when there are loss/reordering in the network.
In my test, i launch
tipc-pipe --rdm --mc 123 --data_num=10000 --data_size=10000
on one node, and on another:
ip link set eth0 down;sleep 2; ip link set eth0 up

I have 15 VM's in my setup, each connected to a bridge using a veth device
(veth0-15). Loss/reordering are accomplished with this (on the hypervisor):
for i in $(seq 0 15); do sudo tc qdisc add dev vnet$i root netem delay 10ms 1ms 25% loss 0.1% ;done

One of the VM's (usually the node where i did the ifdown/ifup was done) will
enter a state where it's constantly requesting retransmissions.
This causes the cluster wide broadcast link to enter a stall, and it can no
longer make forward progress in sending data.

I reproduced this on v4.2-rc3-547-g1df33a1, but i think the problem is older
than this.

//E

------------------------------------------------------------------------------
richard.alpe | 7 Aug 13:38 2015
Picon

[PATCH tipc-discussion iproute2] man pages for the tipc command

From: Richard Alpe <richard.alpe <at> ericsson.com>

Hi Guys,

Here's some man pages for the new tipc command in iproute2. They are written
in groff which is easier to write than to read. I suggest you clone iproute2
merge the patches and view them in man/man8 with "$ man ./tipc.8" etc. This will
make them much easier to read and you will be able to see the layout.

Let me know what you guys think, additions and corrections are much appreciated.

Richard Alpe (1):
  tipc: add man pages

 man/man8/tipc-bearer.8    | 229 +++++++++++++++++++++++++++++++++++++++++++
 man/man8/tipc-link.8      | 242 ++++++++++++++++++++++++++++++++++++++++++++++
 man/man8/tipc-media.8     |  85 ++++++++++++++++
 man/man8/tipc-nametable.8 |  94 ++++++++++++++++++
 man/man8/tipc-node.8      |  71 ++++++++++++++
 man/man8/tipc-socket.8    |  59 +++++++++++
 man/man8/tipc.8           |  98 +++++++++++++++++++
 7 files changed, 878 insertions(+)
 create mode 100644 man/man8/tipc-bearer.8
 create mode 100644 man/man8/tipc-link.8
 create mode 100644 man/man8/tipc-media.8
 create mode 100644 man/man8/tipc-nametable.8
 create mode 100644 man/man8/tipc-node.8
 create mode 100644 man/man8/tipc-socket.8
 create mode 100644 man/man8/tipc.8

(Continue reading)

richard.alpe | 6 Aug 14:14 2015
Picon

[PATCH] tipc: don't sanity check non-existing TLV (NL compat)

From: Richard Alpe <richard.alpe <at> ericsson.com>

A zero length payload means that no TLV (Type Length Value) data has
been passed. Prior to this patch a non-existing TLV could be sanity
checked with TLV_OK() resulting in random behavior where a user
sending an empty message occasionally got a incorrect "operation not
supported" message back.

Signed-off-by: Richard Alpe <richard.alpe <at> ericsson.com>
---
 net/tipc/netlink_compat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c
index 53e0fee..1eadc95 100644
--- a/net/tipc/netlink_compat.c
+++ b/net/tipc/netlink_compat.c
 <at>  <at>  -1114,7 +1114,7  <at>  <at>  static int tipc_nl_compat_recv(struct sk_buff *skb, struct genl_info *info)
 	}

 	len = nlmsg_attrlen(req_nlh, GENL_HDRLEN + TIPC_GENL_HDRLEN);
-	if (TLV_GET_LEN(msg.req) && !TLV_OK(msg.req, len)) {
+	if (len && !TLV_OK(msg.req, len)) {
 		msg.rep = tipc_get_err_tlv(TIPC_CFG_NOT_SUPPORTED);
 		err = -EOPNOTSUPP;
 		goto send;
--

-- 
2.1.4

------------------------------------------------------------------------------
(Continue reading)

Jon Maloy | 31 Jul 00:24 2015
Picon

[PATCH net-next 00/12] tipc: separate link aggregation from link layer

We continue the work on separating the roles of the link aggregation and
link layers, as well as making code cleanups in general.

This second commit batch focuses on moving the orchestration of link
failover and synchronization to the node level, as well as preparing the
node lock structure for further future impovements. We also make some
changes to message delivery between link and socket layer, in order to
make this mechanism safer and less obscure.

Jon Maloy (12):
  tipc: eliminate function tipc_link_activate()
  tipc: move all link_reset() calls to link aggregation level
  tipc: reverse call order for link_reset()->node_link_down()
  tipc: extend node FSM
  tipc: move link synch and failover to link aggregation level
  tipc: move protocol message sending away from link FSM
  tipc: merge link->exec_mode and link->state into one FSM
  tipc: move received discovery data evaluation inside node.c
  tipc: make resetting of links non-atomic
  tipc: remove implicit message delivery in node_unlock()
  tipc: use temporary, non-protected skb queue for bundle reception
  tipc: clean up link creation

 net/tipc/bearer.c   |    4 +-
 net/tipc/core.h     |    5 +
 net/tipc/discover.c |  116 +-----
 net/tipc/link.c     | 1060 +++++++++++++++++++--------------------------------
 net/tipc/link.h     |   61 ++-
 net/tipc/msg.h      |   83 ++--
 net/tipc/node.c     |  663 ++++++++++++++++++++++++--------
(Continue reading)

Jon Maloy | 30 Jul 04:40 2015
Picon

[PATCH net-next v2 00/12] More separation between link aggregation and link layer

We continue the work on separating the roles of the link aggregation and
link layers, as well as making code cleanups in general.

This second commit batch focuses on moving the orchestration of link
failover and synchronization to the node level, as well as preparing the
node lock structure for further future impovements. We also make some
changes to message delivery between link and socket layer, in order to
make this mechanism safer and less obscure.

v2: - Replaced "rc |= ..."  with " rc = ..." at several locations.
    - #5: Some improved comments in tipc_node_check_state()
    - #5 Replaced the loop skb_queue_walk_safe() in function 
         tipc_link_rcv() with a while loop. This fixes a crash. 

Jon Maloy (12):
  tipc: eliminate function tipc_link_activate()
  tipc: move all link_reset() calls to link aggregation level
  tipc: reverse call order for link_reset()->node_link_down()
  tipc: extend node FSM
  tipc: move link synch and failover to link aggregation level
  tipc: move protocol message sending away from link FSM
  tipc: merge link->exec_mode and link->state into one FSM
  tipc: move received discovery data evaluation inside node.c
  tipc: make resetting of links non-atomic
  tipc: remove implicit message delivery in node_unlock()
  tipc: use temporary, non-protected skb queue for bundle reception
  tipc: clean up link creation

 net/tipc/bearer.c   |    4 +-
 net/tipc/core.h     |    5 +
(Continue reading)


Gmane