Daniel Borkmann | 31 Aug 19:11 2015

[PATCH nf] netfilter: conntrack: use nf_ct_tmpl_free in CT/synproxy error paths

Commit 0838aa7fcfcd ("netfilter: fix netns dependencies with conntrack
templates") migrated templates to the new allocator api, but forgot to
update error paths for them in CT and synproxy to use nf_ct_tmpl_free()
instead of nf_conntrack_free().

Because of that, memory is freed into the wrong kmemcache, and we also
drop the per-net reference count of ct objects, causing an imbalance.

In Brad's case, this leads to a wrap-around of net->ct.count and thus
causes __nf_conntrack_alloc() to refuse to create a new ct object:

  [   10.340913] xt_addrtype: ipv6 does not support BROADCAST matching
  [   10.810168] nf_conntrack: table full, dropping packet
  [   11.917416] r8169 0000:07:00.0 eth0: link up
  [   11.917438] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
  [   12.815902] nf_conntrack: table full, dropping packet
  [   15.688561] nf_conntrack: table full, dropping packet
  [   15.689365] nf_conntrack: table full, dropping packet
  [   15.690169] nf_conntrack: table full, dropping packet
  [   15.690967] nf_conntrack: table full, dropping packet

With slab debugging, it also reports the wrong kmemcache (kmalloc-512 vs.
nf_conntrack_ffffffff81ce75c0) and reports poison overwrites, etc. Thus,
to fix the problem, export and use nf_ct_tmpl_free() instead.
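
The wrap-around described above can be illustrated with a small userspace
model. This is only a sketch under assumptions: it assumes the per-net count
is a signed int that gets compared against an unsigned limit, and the names
ct_model, ct_alloc and ct_free_wrong are hypothetical, not kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the per-net conntrack accounting. */
struct ct_model {
	int count;        /* stands in for net->ct.count */
	unsigned int max; /* stands in for the table limit */
};

/* Account first, then check the limit, as an allocator might do. */
static bool ct_alloc(struct ct_model *m)
{
	m->count++;
	/* Signed vs. unsigned comparison: a negative count is converted
	 * to a huge unsigned value and looks like a full table. */
	if (m->max && (unsigned int)m->count > m->max) {
		m->count--;
		return false; /* "table full, dropping packet" */
	}
	return true;
}

/* Assumption: a template was never accounted in count, so releasing it
 * through the regular free path decrements a counter it never raised. */
static void ct_free_wrong(struct ct_model *m)
{
	m->count--;
}
```

In this model, two calls to ct_free_wrong() on an empty table leave count at
-2, and every later ct_alloc() is refused even though nothing is allocated.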

Fixes: 0838aa7fcfcd ("netfilter: fix netns dependencies with conntrack templates")
Reported-by: Brad Jackson <bjackson0971@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Vijay Subramanian | 30 Aug 00:34 2015

[PATCH nf 1/1] nft: Fix nlmsg_type in GET operation callbacks

nf_tables_gettable(), nf_tables_getchain() and nf_tables_getrule()
send replies with an nlmsg_type that corresponds to the ADD operation
instead of GET. Set the type correctly.
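
For reference, a minimal sketch of how the two reply types differ on the wire.
The constants are copied from the Linux uapi headers so the snippet is
self-contained; the helper nfnl_type() is a hypothetical name used only for
illustration.

```c
#include <stdint.h>

/* Values as found in the Linux uapi headers, repeated here on purpose. */
#define NFNL_SUBSYS_NFTABLES 10

enum nft_msg_sketch {
	NFT_MSG_NEWTABLE = 0,
	NFT_MSG_GETTABLE = 1,
};

/* nfnetlink packs the subsystem id into the high byte of nlmsg_type
 * and the per-subsystem message into the low byte. */
static uint16_t nfnl_type(uint8_t subsys, uint8_t msg)
{
	return (uint16_t)((subsys << 8) | msg);
}
```

With these values, a reply built with NFT_MSG_NEWTABLE carries type 0x0a00,
while one built with NFT_MSG_GETTABLE carries 0x0a01.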

Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
 net/netfilter/nf_tables_api.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index cfe6368..b97182a 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -571,7 +571,7 @@ static int nf_tables_gettable(struct sock *nlsk, struct sk_buff *skb,
 		return -ENOMEM;

 	err = nf_tables_fill_table_info(skb2, net, NETLINK_CB(skb).portid,
-					nlh->nlmsg_seq, NFT_MSG_NEWTABLE, 0,
+					nlh->nlmsg_seq, NFT_MSG_GETTABLE, 0,
 					family, table);
 	if (err < 0)
 		goto err;
@@ -1136,7 +1136,7 @@ static int nf_tables_getchain(struct sock *nlsk, struct sk_buff *skb,
 		return -ENOMEM;

 	err = nf_tables_fill_chain_info(skb2, net, NETLINK_CB(skb).portid,
-					nlh->nlmsg_seq, NFT_MSG_NEWCHAIN, 0,
+					nlh->nlmsg_seq, NFT_MSG_GETCHAIN, 0,
 					family, table, chain);
 	if (err < 0)

Jozsef Kadlecsik | 28 Aug 19:15 2015

[PATCH 0/1] ipset patch for nf

Hi Pablo,

Please apply the next bugfix patch against the nf tree.

- Dave Jones reported that KASan detected out of bounds access in hash:net*
  types, which is fixed in the next patch


The following changes since commit 18e1db67e93ed75d9dc0d34c8d783ccf10547c2b:

  netfilter: bridge: fix IPv6 packets not being bridged with CONFIG_IPV6=n (2015-08-19 21:21:41 +0200)

are available in the git repository at:

  git://blackhole.kfki.hu/nf master

for you to fetch changes up to 6fe7ccfd77415a6ba250c10c580eb3f9acf79753:

  netfilter: ipset: Out of bound access in hash:net* types fixed (2015-08-28 18:51:30 +0200)

Jozsef Kadlecsik (1):
      netfilter: ipset: Out of bound access in hash:net* types fixed

 net/netfilter/ipset/ip_set_hash_gen.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Florian Westphal | 28 Aug 00:17 2015

[PATCH -next] netfilter: nfnetlink: work around wrong endianness with old nft userspace

The nfgenmsg res_id is __be16.  Unfortunately nftables batch support uses
host byte order.

This adds a compat workaround for old nft userspace.
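
One possible shape of such a workaround, as a userspace sketch. The helper
name decode_res_id is hypothetical, and the logic is an assumption about the
general approach rather than a copy of the patch.

```c
#include <arpa/inet.h>
#include <stdint.h>

#define NFNL_SUBSYS_NFTABLES 10 /* value from the Linux uapi headers */

/* Hypothetical helper: accept res_id from both fixed and old userspace. */
static uint16_t decode_res_id(uint16_t wire)
{
	/* Old nft userspace stored the subsystem id in host byte order. */
	if (wire == NFNL_SUBSYS_NFTABLES)
		return NFNL_SUBSYS_NFTABLES;
	/* Everything else is treated as proper network byte order. */
	return ntohs(wire);
}
```

The check is ambiguous in principle: on little-endian, a network-order id of
0x0a00 would also match. That should be harmless as long as subsystem ids stay
small.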

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
 net/netfilter/nfnetlink.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 0c0e8ec..2e255ad 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -276,18 +276,24 @@ enum {

 static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
-				u_int16_t subsys_id)
+				__be16 __subsys_id)
 {
 	struct sk_buff *oskb = skb;
 	struct net *net = sock_net(skb->sk);
 	const struct nfnetlink_subsystem *ss;
 	const struct nfnl_callback *nc;
 	static LIST_HEAD(err_list);
+	u16 subsys_id = ntohs(__subsys_id);
 	u32 status;
 	int err;

Florian Westphal | 28 Aug 00:16 2015

[PATCH -next] netfilter: reduce sparse warnings

bridge/netfilter/ebtables.c:290:26: warning: incorrect type in assignment (different modifiers)
-> remove __pure annotation.

ipv6/netfilter/ip6t_SYNPROXY.c:240:27: warning: cast from restricted __be16
-> switch ntohs to htons and vice versa.

netfilter/core.c:391:30: warning: symbol 'nfq_ct_nat_hook' was not declared. Should it be static?
-> delete it, got removed

net/netfilter/nf_synproxy_core.c:221:48: warning: cast to restricted __be32
-> Use __be32 instead of u32.

Tested with objdiff that these changes do not affect generated code.
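
Swapping ntohs for htons cannot change the generated code because both perform
the same operation on any given host. A small userspace check of that claim,
with htons_matches_ntohs as a hypothetical helper name:

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if htons() and ntohs() agree for every 16-bit value. */
static bool htons_matches_ntohs(void)
{
	for (uint32_t v = 0; v <= UINT16_MAX; v++) {
		if (htons((uint16_t)v) != ntohs((uint16_t)v))
			return false;
	}
	return true;
}
```

The two names only document the direction of the conversion, which is exactly
what sparse checks through the __be16 annotations.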

Signed-off-by: Florian Westphal <fw@strlen.de>
 net/bridge/netfilter/ebtables.c    | 2 +-
 net/ipv6/netfilter/ip6t_SYNPROXY.c | 2 +-
 net/netfilter/core.c               | 3 ---
 net/netfilter/nf_synproxy_core.c   | 6 +++---
 4 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 18ca4b2..48b6b01 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -176,7 +176,7 @@ ebt_basic_match(const struct ebt_entry *e, const struct sk_buff *skb,
 	return 0;


Florian Westphal | 27 Aug 17:31 2015

nftables batch abi broken ...


Batch handling in libnftnl uses this:
libnftnl/src/common.c:  nfg->res_id = NFNL_SUBSYS_NFTABLES;

but it should be:
libnftnl/src/common.c:  nfg->res_id = htons(NFNL_SUBSYS_NFTABLES)

since res_id is a __be16.

The kernel contains the same error when decoding batch messages
which is why this works :-/
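
A minimal userspace sketch of why two matching errors cancel out, and why
fixing only one side breaks on little-endian hosts. All four helper names are
hypothetical stand-ins for the userspace and kernel code paths.

```c
#include <arpa/inet.h>
#include <stdint.h>

#define NFNL_SUBSYS_NFTABLES 10 /* value from the Linux uapi headers */

/* Sender without htons(): the id is stored in host byte order. */
static uint16_t send_old(uint16_t id)   { return id; }
/* Sender that honours the __be16 type of res_id. */
static uint16_t send_fixed(uint16_t id) { return htons(id); }

/* Receiver that reads the field as if it were a plain u16. */
static uint16_t recv_old(uint16_t wire)   { return wire; }
/* Receiver that converts from network byte order. */
static uint16_t recv_fixed(uint16_t wire) { return ntohs(wire); }
```

On a little-endian host, recv_fixed(send_old(NFNL_SUBSYS_NFTABLES)) yields
0x0a00 instead of 10, while the old/old and fixed/fixed pairings both round-trip.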

I found this problem when looking at sparse error reports on the kernel
where sparse complains about the following line in nfnetlink_rcv():

    nfnetlink_rcv_batch(skb, nlh, nfgenmsg->res_id);

and the sparse complaint is correct: __be16 is treated as u16 without
conversion.

How to fix this?
If we want to maintain the ABI on little-endian, the only solution is to
"fix" it in the kernel by annotating this with an explicit cast to u16.

But that sucks, since ->res_id is used via htons/ntohs in all other places.

Any ideas?


Arturo Borrero Gonzalez | 27 Aug 12:57 2015

[conntrackd] allowing DisableExternalCache in alarm mode


The documentation about DisableExternalCache reads:

 You can also use this option with the NOTRACK and ALARM modes.
 This increases CPU consumption in the backup firewall but now you do not
 need to commit the flow-states during the master failures since they are
 already in the in-kernel Connection Tracking table. Moreover, you save
 memory in the backup firewall since you do not need to store the
 foreign flow-states anymore.

However, the config parser doesn't allow it. The patch seems rather trivial:

diff --git a/src/read_config_yy.y b/src/read_config_yy.y
index 73fabbf..d53aa70 100644
--- a/src/read_config_yy.y
+++ b/src/read_config_yy.y
@@ -908,6 +908,7 @@ sync_mode_alarm_line: refreshtime
                         | purge
                         | relax_transitions
                         | delay_destroy_msgs
+                        | disable_external_cache


However, there seems to be some missing bits somewhere, the backup

Florian Westphal | 26 Aug 23:20 2015

[PATCH -next] Revert "netfilter: xtables: compute exact size needed for jumpstack"

This reverts commit 98d1bd802cdbc8f56868fae51edec13e86b59515.

mark_source_chains will not re-visit chains, so

:INPUT ACCEPT [365:25776]
:OUTPUT ACCEPT [217:45832]
:t1 - [0:0]
:t2 - [0:0]
:t3 - [0:0]
:t4 - [0:0]
-A t1 -i lo -j t2
-A t2 -i lo -j t3
-A t3 -i lo -j t4
# -A INPUT -j t4
# -A INPUT -j t3
# -A INPUT -j t2
-A INPUT -j t1

Will compute a chain depth of 2 if the comments are removed.
Revert back to counting the number of chains for the time being.
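
The underestimate can be reproduced with a simplified model of the walk. This
is a sketch under assumptions, not the kernel algorithm: it uses a recursive
traversal over a hypothetical struct chain, assumes the ruleset has no loops,
and records the depth of a jump before deciding whether to skip an
already-visited target.

```c
#include <stdbool.h>

#define MAX_CHAINS 8
#define MAX_JUMPS  4

/* Hypothetical ruleset model: each chain lists its jump targets in rule order. */
struct chain {
	int njumps;
	int jumps[MAX_JUMPS];
};

static void walk(const struct chain *chains, int cur, unsigned int depth,
		 bool *visited, bool skip_visited, unsigned int *max)
{
	for (int i = 0; i < chains[cur].njumps; i++) {
		int target = chains[cur].jumps[i];
		unsigned int d = depth + 1;

		if (d > *max)
			*max = d;
		/* A skipped chain is not walked again, so any deeper
		 * jumps behind it are never counted from this path. */
		if (skip_visited && visited[target])
			continue;
		visited[target] = true;
		walk(chains, target, d, visited, skip_visited, max);
	}
}

/* Deepest jump nesting reachable from start, with or without revisiting. */
static unsigned int max_jump_depth(const struct chain *chains, int start,
				   bool skip_visited)
{
	bool visited[MAX_CHAINS] = { false };
	unsigned int max = 0;

	visited[start] = true;
	walk(chains, start, 0, visited, skip_visited, &max);
	return max;
}
```

For the ruleset above with the comments removed (INPUT jumping to t4, t3, t2
and t1 in that order), this model reports 2 when visited chains are skipped
and 4 when they are walked again.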

Reported-by: Cong Wang <cwang@twopensource.com>
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
 net/ipv4/netfilter/arp_tables.c | 19 +++++++------------
 net/ipv4/netfilter/ip_tables.c  | 28 ++++++++++------------------

David Hinkle | 25 Aug 21:14 2015

How to set connmark on a socket descriptor from userspace?

I want to be able to set the connmark on a socket controlled by my
application from user space.  Is there an API to do that already?

I have been reviewing the kernel code and I can't seem to find one.
If there isn't, what would be your recommendation on the path to take
to implement such an option?  It looks like setsockopt shouldn't be
too hard to extend.  I guess extending ioctl or fcntl would be my
other options?  But I see at least one place where getsockopt already
interacts with conntrack data, handling SO_ORIGINAL_DST.

Advice, docs, and suggestions greatly appreciated.  Thank you for your time.

- David

Cong Wang | 25 Aug 20:04 2015

WARNING at net/ipv4/netfilter/ip_tables.c:530

Hi, Florian

Your commit 98d1bd802cdbc8f56868fae51edec13e86b59515 (netfilter:
xtables: compute exact size needed for jumpstack) introduced the
following kernel warning during boot.

It looks like the if check inside mark_source_chains() isn't correct,
so calldepth is not increased as expected, but I don't have time
to dig into this.

Let me know if you need more information.

[   46.310023] ------------[ cut here ]------------
[   46.314701] WARNING: CPU: 3 PID: 719 at net/ipv4/netfilter/ip_tables.c:530 mark_source_chains+0x16b/0x20e()
[   46.321091] CPU: 3 PID: 719 Comm: iptables Not tainted 4.2.0-rc7+ #1095
[   46.325061] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   46.328713]  0000000000000009 ffff8800d3d43c08 ffffffff81a6b5f4
[   46.334895]  0000000000000000 ffff8800d3d43c48 ffffffff8107a382
[   46.342152]  ffffffff8194c17f ffff880118b98368 0000000000002580
[   46.348474] Call Trace:
[   46.350123]  [<ffffffff81a6b5f4>] dump_stack+0x4c/0x65
[   46.353416]  [<ffffffff8107a382>] warn_slowpath_common+0x9c/0xb6
[   46.357157]  [<ffffffff8194c17f>] ? mark_source_chains+0x16b/0x20e
[   46.361080]  [<ffffffff8107a467>] warn_slowpath_null+0x1a/0x1c
[   46.364684]  [<ffffffff8194c17f>] mark_source_chains+0x16b/0x20e
[   46.368388]  [<ffffffff8194c3fd>] translate_table+0x1db/0x3a7

Daniel Borkmann | 25 Aug 15:33 2015

[PATCH conntrack-tools] conntrack: add zone direction support

This patch adds support for zone directions.

Since all options have the orig/reply as a prefix, I named it --orig-zone
and --reply-zone to stay consistent with the rest of the cmdline options.

As for the option chars, no reasonable unallocated combination was left,
so only the long options are officially exposed in the help, as in
other cases.

Test suite results, after patch: OK: 79 BAD: 0

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
 conntrack.8                      | 10 +++++-
 include/conntrack.h              |  2 +-
 src/conntrack.c                  | 67 ++++++++++++++++++++++++++--------------
 tests/conntrack/testsuite/04zone | 18 ++++++++++-
 4 files changed, 70 insertions(+), 27 deletions(-)

diff --git a/conntrack.8 b/conntrack.8
index abc26c5..a981a76 100644
--- a/conntrack.8
+++ b/conntrack.8
@@ -1,4 +1,4 @@
-.TH CONNTRACK 8 "Sep 25, 2014" "" ""
+.TH CONNTRACK 8 "Aug 24, 2015" "" ""

 .\" Man page written by Harald Welte <laforge@netfilter.org (Jun 2005)
 .\" Maintained by Pablo Neira Ayuso <pablo@netfilter.org (May 2007)
@@ -176,6 +176,14 @@ Filter any NAT connections.