Sven Eckelmann | 29 Jun 23:52 2016

[PATCH maint 1/3] batman-adv: Fix orig_node_vlan leak on orig_node_release

batadv_orig_node_new uses batadv_orig_node_vlan_new to allocate a new
batadv_orig_node_vlan and add it to batadv_orig_node::vlan_list. References
to this list have also to be cleaned when the batadv_orig_node is removed.

Fixes: 21a57f6e7a3b ("batman-adv: make the TT CRC logic VLAN specific")
Signed-off-by: Sven Eckelmann <sven@...>
---
 net/batman-adv/originator.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c
index 7f51bc2..fe2fcda 100644
--- a/net/batman-adv/originator.c
+++ b/net/batman-adv/originator.c
 <at>  <at>  -765,6 +765,7  <at>  <at>  static void batadv_orig_node_release(struct kref *ref)
 	struct batadv_neigh_node *neigh_node;
 	struct batadv_orig_node *orig_node;
 	struct batadv_orig_ifinfo *orig_ifinfo;
+	struct batadv_orig_node_vlan *vlan;

 	orig_node = container_of(ref, struct batadv_orig_node, refcount);

 <at>  <at>  -784,6 +785,13  <at>  <at>  static void batadv_orig_node_release(struct kref *ref)
 	}
 	spin_unlock_bh(&orig_node->neigh_list_lock);

+	spin_lock_bh(&orig_node->vlan_list_lock);
+	hlist_for_each_entry_safe(vlan, node_tmp, &orig_node->vlan_list, list) {
+		hlist_del_rcu(&vlan->list);
+		batadv_orig_node_vlan_put(vlan);
(Continue reading)

Sven Eckelmann | 29 Jun 23:44 2016

[PATCH 1/2] batman-adv: Place kref_get near use of reference

It is hard to understand why the refcnt is increased when it isn't done
near the actual place the new reference is used. So using kref_get right
before the place which requires the reference and in the same function
helps to avoid accidental problems causedy incorrect reference counting.

Signed-off-by: Sven Eckelmann <sven@...>
---
 net/batman-adv/bat_iv_ogm.c            | 7 ++++---
 net/batman-adv/bat_v_ogm.c             | 5 ++---
 net/batman-adv/bridge_loop_avoidance.c | 9 ++++-----
 net/batman-adv/distributed-arp-table.c | 2 +-
 net/batman-adv/gateway_client.c        | 8 ++++++--
 net/batman-adv/hard-interface.c        | 6 ++----
 net/batman-adv/network-coding.c        | 9 ++++-----
 net/batman-adv/originator.c            | 8 ++++----
 net/batman-adv/soft-interface.c        | 4 ++++
 net/batman-adv/translation-table.c     | 6 +++---
 net/batman-adv/tvlv.c                  | 9 +++++++++
 11 files changed, 43 insertions(+), 30 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 6af4462..cdeb48e 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
 <at>  <at>  -318,17 +318,18  <at>  <at>  batadv_iv_ogm_orig_get(struct batadv_priv *bat_priv, const u8 *addr)
 	if (!orig_node->bat_iv.bcast_own_sum)
 		goto free_orig_node;

+	kref_get(&orig_node->refcount);
 	hash_added = batadv_hash_add(bat_priv->orig_hash, batadv_compare_orig,
(Continue reading)

Sven Eckelmann | 27 Jun 01:02 2016

[PATCH v2] batman-adv: Remove orig_node reference handling from send_skb_unicast

The function batadv_send_skb_unicast is not acquiring a reference for an
orig_node nor removing it from any datastructure. It still reduces the
reference counter for an object which is still in the hands of the caller.

This is confusing and can lead to problems in the reference handling in the
caller function.

Signed-off-by: Sven Eckelmann <sven@...>
---
v2:
 - remove bogus multicast example
 - remove Fixes:

 net/batman-adv/send.c           | 22 ++++++++++++++++------
 net/batman-adv/soft-interface.c |  3 +++
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index 729deec..44be408 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
 <at>  <at>  -362,8 +362,6  <at>  <at>  int batadv_send_skb_unicast(struct batadv_priv *bat_priv,
 		ret = NET_XMIT_SUCCESS;

 out:
-	if (orig_node)
-		batadv_orig_node_put(orig_node);
 	if (ret == NET_XMIT_DROP)
 		kfree_skb(skb);
 	return ret;
(Continue reading)

Sven Eckelmann | 27 Jun 00:55 2016

[PATCH maint] batman-adv: Remove orig_node reference handling from send_skb_unicast

The function batadv_send_skb_unicast is not acquiring a reference for an
orig_node nor removing it from any datastructure. It still reduces the
reference counter for an object which is still in the hands of the caller.

This is confusing and can lead to problems in the reference handling in the
caller function. This breaks for example the reference counting in
batadv_interface_tx when an DHCP packet (BATADV_DHCP_TO_SERVER)  is
detected in a multicast (not broadcast) packet which is later detected by
the multicast code as BATADV_FORW_SINGLE. The multicast code then assumes
that it will send it and acquires the originator to the destination. But in
reality, the DHCP forwarding code takes precedence and as result the
reference counter is not decreased anymore.

Fixes: 405cc1e5a81e ("batman-adv: Modified forwarding behaviour for multicast packets")
Signed-off-by: Sven Eckelmann <sven@...>
---
 net/batman-adv/send.c           | 22 ++++++++++++++++------
 net/batman-adv/soft-interface.c |  3 +++
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index 729deec..44be408 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
 <at>  <at>  -362,8 +362,6  <at>  <at>  int batadv_send_skb_unicast(struct batadv_priv *bat_priv,
 		ret = NET_XMIT_SUCCESS;

 out:
-	if (orig_node)
-		batadv_orig_node_put(orig_node);
(Continue reading)

Sven Eckelmann | 27 Jun 08:15 2016

[PATCH v3] batman-adv: Remove orig_node reference handling from send_skb_unicast

The function batadv_send_skb_unicast is not acquiring a reference for an
orig_node nor removing it from any datastructure. It still reduces the
reference counter for an object which is still in the hands of the caller.

This is confusing and can lead in the future to problems in the reference
handling of the caller function.

Signed-off-by: Sven Eckelmann <sven@...>
---
v3:
 - adjust commit message to sound less like an fix (thanks Linus)
 - Remove " and release a reference to this orig_node" from kerneldoc of
   batadv_send_skb_unicast (thanks Linus)
v2:
 - remove bogus multicast example
 - remove Fixes:
---
 net/batman-adv/send.c           | 25 +++++++++++++++++--------
 net/batman-adv/soft-interface.c |  3 +++
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index 729deec..b4294ac 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
 <at>  <at>  -307,8 +307,7  <at>  <at>  out:
  *
  * Wrap the given skb into a batman-adv unicast or unicast-4addr header
  * depending on whether BATADV_UNICAST or BATADV_UNICAST_4ADDR was supplied
- * as packet_type. Then send this frame to the given orig_node and release a
(Continue reading)

Sven Eckelmann | 26 Jun 11:15 2016

[PATCH net 0/5] batman-adv: Fixes for Linux 4.7

Hi David,

Antonio currently seems to be occupied. This is currently rather unfortunate 
because there are patches waiting in the batman-adv development repository 
maint(enance) branch [1] since up to 6 weeks. I am now getting asked when 
these patches will hit the distribution kernels and therefore decided to 
submit these patches directly to netdev.

The patch from Simon works around the problem that warnings could be triggered 
in the translation table code via packets using a VLAN not configured on the 
target host. This warning was replaced with a rate limited info message.

Ben Hutchings found an superfluous batadv_softif_vlan_put in the error 
handling code of the translation table while he backported the "batman-adv: 
Fix reference counting of vlan object for tt_local_entry" patch to the stable 
kernels. He noticed correctly that this batadv_softif_vlan_put should also 
have been removed by the said patch.

The most requested fix at the moment is related to a double free in the 
translation table code. It is a race condition which mostly happens on systems 
with multiple cores and multiple network interface attached to batman-adv. Two 
Freifunk communities which were haunted by weird crashes (with backtraces 
reporting problems in other parts of the kernel) were kind enough to test this 
patch. They reported that there systems are now running stable after applying 
this patch.

An invalid memory access was detected in the batadv_icmp_packet_rr handling 
code when receiving a skbuff with fragments. The last patch is fixing a memory 
leak when the interface is removed via .dellink. The code to fix it was copied 
from the code handling the legacy sysfs interface to remove netdevices from a 
(Continue reading)

Sven Eckelmann | 25 Jun 16:44 2016

[PATCH v2] batman-adv: use kmem_cache for translation table

The translation table (global, local) is usually the part of batman-adv
which has the most dynamical allocated objects. Most of them
(tt_local_entry, tt_global_entry, tt_orig_list_entry, tt_change_node,
tt_req_node, tt_roam_node) are equally sized. So it makes sense to have
them allocated from a kmem_cache for each type.

This approach allowed a small wireless router (TP-Link TL-841NDv8; SLUB
allocator) to store 34% more translation table entries compared to the
current implementation.

[1] https://open-mesh.org/projects/batman-adv/wiki/Kmalloc-kmem-cache-tests

Reported-by: Linus Lüssing <linus.luessing@...>
Signed-off-by: Sven Eckelmann <sven@...>
---
 net/batman-adv/main.c              |  16 +++-
 net/batman-adv/translation-table.c | 169 +++++++++++++++++++++++++++++++++----
 net/batman-adv/translation-table.h |   3 +
 3 files changed, 169 insertions(+), 19 deletions(-)

diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index f61479b..ef07e5b 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
 <at>  <at>  -82,6 +82,12  <at>  <at>  static void batadv_recv_handler_init(void);

 static int __init batadv_init(void)
 {
+	int ret;
+
(Continue reading)

Sven Eckelmann | 24 Jun 21:43 2016

[PATCH maint] batman-adv: Avoid tt_req_node list put for unhashed entry

It can happen that a tt_req_node list entry was already removed from
tt.req_list when batadv_send_tt_request reaches the end of the function.
The reference counter was already reduced by 1 for the list entry and thus
the reference counter is not allowed to be reduced again. Otherwise, the
entry is freed too early and the next batadv_tt_req_node_put in this
function will operate on freed memory.

Fixes: cea194d90b11 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@...>
---
 net/batman-adv/translation-table.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
index 3c32f5f..7e6df7a 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
 <at>  <at>  -2639,11 +2639,13  <at>  <at>  static bool batadv_send_tt_request(struct batadv_priv *bat_priv,
 out:
 	if (primary_if)
 		batadv_hardif_put(primary_if);
+
 	if (ret && tt_req_node) {
 		spin_lock_bh(&bat_priv->tt.req_list_lock);
-		/* hlist_del_init() verifies tt_req_node still is in the list */
-		hlist_del_init(&tt_req_node->list);
-		batadv_tt_req_node_put(tt_req_node);
+		if (!hlist_unhashed(&tt_req_node->list)) {
+			hlist_del_init(&tt_req_node->list);
+			batadv_tt_req_node_put(tt_req_node);
(Continue reading)

Sven Eckelmann | 23 Jun 12:53 2016

Re: Adhoc0 Interface not showing up

[Please don't remove b.a.t.m.a.n@... without
giving a reason. I nearly only replied to you instead to
the mailing list]

On Thursday 23 June 2016 16:13:11 Sailash Moirangthem wrote:
> Please check the results of the commands.

"ip link" shows that the adhoc0 interface is down. Please make
sure that the interface goes up. This is now definitely an
OpenWrt+wifi driver question and not related to batman-adv.

The interface combination [1] is showing only
"#{ managed, AP, mesh point } <= 8,". So you can try to
deactivate your AP interface in /etc/config/wireless. Make
sure that you only see the adhoc0 interface in `iw dev`
after you've restarted the wifi.

Kind regards,
	Sven

[1] https://www.kernel.org/doc/htmldocs/80211/API-struct-ieee80211-iface-combination.html
Sailash Moirangthem | 23 Jun 11:24 2016
Picon

Adhoc0 Interface not showing up

Hi Team,

I really appreciate for all your contributions in the mesh network.
I am trying to set-up mesh network using tplink WR1043ND routers and
raspberry pi B+ integrated with ra2800usb (wi pi). The tplink routers
are working as expected. Howoever, the raspberry pi is not working.

The configuration file for the Pi board is as follows:

config interface loopback
 option ifname lo
option proto static
 option ipaddr 127.0.0.1
 option netmask 255.0.0.0
config interface lan
option ifname eth0
option type bridge
 option proto static
option ipaddr 192.168.1.1
option netmask 255.255.255.0
 config interface 'mesh'
option ifname 'adhoc0'
option mtu '1528'
 option proto 'batadv'
 option mesh 'bat0'
 config interface mb
option ifname bat0
option type bridge option proto static
 option ipaddr 192.168.99.26
 option netmask 255.255.255.0
(Continue reading)

Thomas Tiotto | 22 Jun 15:49 2016

question regarding batmand convergence time

Good day,
I'm currently finishing writing my bachelor thesis whose topic is mesh
networks.  I have been running batmand and comparing it in performance
to olsrd; one metric I can't seem to be able to measure, though, is
network convergence time, i.e., the time it takes for a new node to be
discovered when connected and forgotten when disconnected from the
network.  This is due to the fact that batmand doesn't seem to display
updated information in the same way that olsrd does (with
DebugLevel=2).  I've tried looking at the routing tables but these also
aren't updated when a host disconnects and get purged after some time
has passed.
Is there any way I can get a real time view of the network to calculate
the time it takes for a host to be discovered/forgotten?

Thanks,

Thomas Tiotto
Università degli Studi di Milano


Gmane