[PATCH 4.1 155/159] netlink: make sure -EBUSY wont escape from netlink_insert

From: Greg Kroah-Hartman
Date: Sat Sep 26 2015 - 17:22:37 EST

4.1-stable review patch. If anyone has any objections, please let me know.


From: Daniel Borkmann <daniel@xxxxxxxxxxxxx>

[ Upstream commit 4e7c1330689e27556de407d3fdadc65ffff5eb12 ]

Linus reports the following deadlock on rtnl_mutex; triggered only
once so far (extract):

[12236.694209] NetworkManager D 0000000000013b80 0 1047 1 0x00000000
[12236.694218] ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018
[12236.694224] ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff
[12236.694230] ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80
[12236.694235] Call Trace:
[12236.694250] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694257] [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694263] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694271] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694280] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694291] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694299] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694309] [<ffffffff814f5ad3>] ? rtnl_getlink+0x113/0x190
[12236.694319] [<ffffffff814f202a>] ? rtnetlink_rcv_msg+0x7a/0x210
[12236.694331] [<ffffffff8124565c>] ? sock_has_perm+0x5c/0x70
[12236.694339] [<ffffffff814f1fb0>] ? rtnetlink_rcv+0x30/0x30
[12236.694346] [<ffffffff8150d62c>] ? netlink_rcv_skb+0x9c/0xc0
[12236.694354] [<ffffffff814f1f9f>] ? rtnetlink_rcv+0x1f/0x30
[12236.694360] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694367] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694376] [<ffffffff810a236f>] ? __wake_up+0x2f/0x50
[12236.694387] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694396] [<ffffffff814cb05e>] ? ___sys_sendmsg+0x22e/0x240
[12236.694405] [<ffffffff814cab75>] ? ___sys_recvmsg+0x135/0x1a0
[12236.694415] [<ffffffff811a9d12>] ? eventfd_write+0x82/0x210
[12236.694423] [<ffffffff811a0f9e>] ? fsnotify+0x32e/0x4c0
[12236.694429] [<ffffffff8108cb70>] ? wake_up_q+0x60/0x60
[12236.694434] [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70
[12236.694440] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a

It seems so far plausible that the recursive call into rtnetlink_rcv()
looks suspicious. One way, where this could trigger is that the senders
NETLINK_CB(skb).portid was wrongly 0 (which is rtnetlink socket), so
the rtnl_getlink() request's answer would be sent to the kernel instead
to the actual user process, thus grabbing rtnl_mutex() twice.

One theory would be that netlink_autobind() triggered via netlink_sendmsg()
internally overwrites the -EBUSY error to 0, but where it is wrongly
originating from __netlink_insert() instead. That would reset the
socket's portid to 0, which is then filled into NETLINK_CB(skb).portid
later on. As commit d470e3b483dc ("[NETLINK]: Fix two socket hashing bugs.")
also puts it, -EBUSY should not be propagated from netlink_insert().

It looks like it's very unlikely to reproduce. We need to trigger the
rhashtable_insert_rehash() handler under a situation where rehashing
currently occurs (one /rare/ way would be to hit ht->elasticity limits
while not filled enough to expand the hashtable, but that would rather
require a specifically crafted bind() sequence with knowledge about
destination slots, seems unlikely). It probably makes sense to guard
__netlink_insert() in any case and remap that error. It was suggested
that EOVERFLOW might be better than an already overloaded ENOMEM.

Reference: http://thread.gmane.org/gmane.linux.network/372676
Reported-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Acked-by: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
Acked-by: Thomas Graf <tgraf@xxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
net/netlink/af_netlink.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1094,6 +1094,11 @@ static int netlink_insert(struct sock *s

err = __netlink_insert(table, sk);
if (err) {
+ /* In case the hashtable backend returns with -EBUSY
+ * from here, it must not escape to the caller.
+ */
+ if (unlikely(err == -EBUSY))
+ err = -EOVERFLOW;
if (err == -EEXIST)
nlk_sk(sk)->portid = 0;

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/