Crash with SO_REUSEPORT and ef456144da8ef507c8cf504284b6042e9201a05c
From: Marc Dionne
Date: Tue Jan 19 2016 - 10:58:06 EST
I shared this one with Craig but I thought I'd put it out to a wider audience.
Trying to run the current kernel mainline on a test system I found
that any attempt to run many of our executables would crash the
system. The networking code in all of these opens and listens on
multiple UDP sockets set with SO_REUSEPORT. We also like to bind the
first socket before setting SO_REUSEPORT so we can catch some cases
where the port is actually in use by someone else (for instance a
previous incarnation of the same service).
This is easily reproduced with this sequence:
- create 2 sockets A and B
- bind socket A to an address
- set SO_REUSEPORT on socket A
- set SO_REUSEPORT on socket B
- bind socket B to the same address as A
The sk_reuseport_cb structure is only allocated at bind time if
SO_REUSEPORT is already set, so A doesn't have one. When we bind B, A
is found as a match that has SO_REUSEPORT and reuseport_add_sock will
try to use the NULL sk_reuseport_cb structure from A, causing a crash.
Not sure what the best fix is, but seems like the structure could be
either allocated (if not already done) when setting SO_REUSEPORT, or
when we find it to be NULL in reuseport_add_sock (but locking may be
an issue there). I was able to test that allocating sk_reuseport_cb
when setting SO_REUSEPORT makes things behave normally again; see
attached patch. That's surely not a correct/complete fix as B (in the
scenario above) will have an unnecessary sk_reuseport_cb which will
trigger a warning and should be dealt with.
Thanks,
Marc
commit 518120f41aec30bdde39caf58818e1e5d2d6a4a4
Author: Marc Dionne <marc.dionne@xxxxxxxxxxxx>
Date: Fri Jan 15 16:02:32 2016 -0400
soreuseport: Allow bind before setting SO_REUSEPORT
If a socket is bound before the SO_REUSEPORT option is
set, it will have no sk_reuseport_cb structure allocated
and attached to it. If there is then an attempt to bind
a second socket to the same port, the first socket will
be found as matching and reuseport_add_sock will attempt
to use the NULL sk_reuseport_cb pointer, resulting in an
oops.
Allocate sk_reuseport_cb in setsockopt if not already set.
Also, don't preclude reusing a port because the sk has
sk_reuseport_cb set.
Signed-off-by: Marc Dionne <marc.dionne@xxxxxxxxxxxx>
diff --git a/net/core/sock.c b/net/core/sock.c
index 5127023..587de5b 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -724,6 +724,8 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
break;
case SO_REUSEPORT:
sk->sk_reuseport = valbool;
+ if (!rcu_access_pointer(sk->sk_reuseport_cb))
+ reuseport_alloc(sk);
break;
case SO_TYPE:
case SO_PROTOCOL:
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index dc45b53..13884b6 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -154,7 +154,6 @@ static int udp_lib_lport_inuse(struct net *net, __u16 num,
(!sk2->sk_bound_dev_if || !sk->sk_bound_dev_if ||
sk2->sk_bound_dev_if == sk->sk_bound_dev_if) &&
(!sk2->sk_reuseport || !sk->sk_reuseport ||
- rcu_access_pointer(sk->sk_reuseport_cb) ||
!uid_eq(uid, sock_i_uid(sk2))) &&
saddr_comp(sk, sk2, true)) {
if (!bitmap)
@@ -190,7 +189,6 @@ static int udp_lib_lport_inuse2(struct net *net, __u16 num,
(!sk2->sk_bound_dev_if || !sk->sk_bound_dev_if ||
sk2->sk_bound_dev_if == sk->sk_bound_dev_if) &&
(!sk2->sk_reuseport || !sk->sk_reuseport ||
- rcu_access_pointer(sk->sk_reuseport_cb) ||
!uid_eq(uid, sock_i_uid(sk2))) &&
saddr_comp(sk, sk2, true)) {
res = 1;