Re: kernel BUG at net/unix/garbage.c:149!

From: Miklos Szeredi
Date: Tue Aug 30 2016 - 05:18:20 EST


On Tue, Aug 30, 2016 at 12:37 AM, Miklos Szeredi <mszeredi@xxxxxxxxxx> wrote:
> On Sat, Aug 27, 2016 at 11:55 AM, Miklos Szeredi <mszeredi@xxxxxxxxxx> wrote:

> crash> list -H gc_inflight_list unix_sock.link -s unix_sock.inflight |
> grep counter | cut -d= -f2 | awk '{s+=$1} END {print s}'
> 130
> crash> p unix_tot_inflight
> unix_tot_inflight = $2 = 135
>
> We've lost track of a total of five inflight sockets, so it's not a
> one-off thing. Really weird... Now off to sleep, maybe I'll dream of
> the solution.

Okay, found one bug: gc assumes that in-flight sockets that don't have
an external ref can't gain one while unix_gc_lock is held. That is
normally true, because unix_notinflight(), which takes unix_gc_lock,
is called before the fds are detached. Only MSG_PEEK was somehow
overlooked: it clones the fds while also keeping them in the skb, so
an external reference can definitely be gained through MSG_PEEK
without ever touching unix_gc_lock.
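
For illustration only (this is not part of the attached patch), here is a
minimal user-space sketch of the behaviour described above: a recvmsg()
with MSG_PEEK on a message carrying SCM_RIGHTS hands the receiver a new
descriptor while the message, and the fd inside it, stays queued. The
helper names send_fd(), recv_fd() and peek_then_receive() are
hypothetical, and the sketch assumes Linux AF_UNIX semantics.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one fd. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one data byte with fd attached as SCM_RIGHTS; returns 0 on success. */
static int send_fd(int sock, int fd)
{
	char byte = 'x';
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive (or, with MSG_PEEK, only peek at) one passed fd; returns -1 on failure. */
static int recv_fd(int sock, int flags)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg = { 0 };
	struct cmsghdr *cmsg;
	int fd = -1;

	memset(&u, 0, sizeof(u));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, flags) < 0)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/*
 * Put a unix socket in flight, peek at it, then really receive it.
 * Returns 0 if both calls produced a valid fd and the two fds differ.
 */
int peek_then_receive(void)
{
	int sv[2], inflight[2], peeked, received, ok;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, inflight) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	ok = send_fd(sv[0], inflight[0]) == 0;
	peeked = recv_fd(sv[1], MSG_PEEK);	/* message stays queued */
	received = recv_fd(sv[1], 0);		/* message is dequeued */
	ok = ok && peeked >= 0 && received >= 0 && peeked != received &&
	     fcntl(peeked, F_GETFD) != -1 && fcntl(received, F_GETFD) != -1;

	if (peeked >= 0)
		close(peeked);
	if (received >= 0)
		close(received);
	close(inflight[0]);
	close(inflight[1]);
	close(sv[0]);
	close(sv[1]);
	return ok ? 0 : -1;
}
```

Calling peek_then_receive() from a test program and checking for a 0
return is enough to see that the peek alone already produced a usable
reference to the in-flight socket.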

Not sure whether the reported bug can be explained by this. Can you
confirm that MSG_PEEK was used in the setup?

Does someone want to write a stress test for SCM_RIGHTS + MSG_PEEK?

Anyway, attaching a fix that works by also acquiring unix_gc_lock in
the MSG_PEEK case. It is trivially correct, but I haven't tested it.
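
The lock-then-unlock idiom the fix relies on can be tried out in user
space. The sketch below is only an analogy under stated assumptions: a
pthread mutex stands in for unix_gc_lock, and gc_lock, gc_thread(),
gc_barrier() and barrier_waits_for_gc() are hypothetical names, not
kernel code. It shows that taking and immediately dropping the lock
makes the caller wait until a critical section already in progress has
finished.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

static pthread_mutex_t gc_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int gc_started;	/* set once the "gc" holds the lock */
static int gc_finished;		/* written only under gc_lock */

/* Stand-in for a gc run: hold the lock for a while, then mark completion. */
static void *gc_thread(void *arg)
{
	int i;

	(void)arg;
	pthread_mutex_lock(&gc_lock);
	atomic_store(&gc_started, 1);
	for (i = 0; i < 1000; i++)	/* simulate work inside the lock */
		sched_yield();
	gc_finished = 1;
	pthread_mutex_unlock(&gc_lock);
	return NULL;
}

/* Wait for any current holder of gc_lock to finish; nothing is protected. */
static void gc_barrier(void)
{
	pthread_mutex_lock(&gc_lock);
	pthread_mutex_unlock(&gc_lock);
}

/* Returns 1 if the run in progress had completed once the barrier returned. */
int barrier_waits_for_gc(void)
{
	pthread_t t;
	int finished;

	atomic_store(&gc_started, 0);
	gc_finished = 0;
	if (pthread_create(&t, NULL, gc_thread, NULL) != 0)
		return -1;

	/* Make sure the "gc" really holds the lock before the barrier. */
	while (!atomic_load(&gc_started))
		sched_yield();

	gc_barrier();
	finished = gc_finished;	/* ordered after the gc thread's unlock */

	pthread_join(t, NULL);
	return finished;
}
```

Note that such a barrier only waits for a run that is already holding
the lock; it does not prevent a new run from starting afterwards.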

Thanks,
Miklos

From: Miklos Szeredi <mszeredi@xxxxxxxxxx>
Subject: af_unix: fix garbage collect vs. MSG_PEEK

Gc assumes that in-flight sockets that don't have an external ref can't
gain one while unix_gc_lock is held. That is normally true, because
unix_notinflight(), which takes unix_gc_lock, is called before the fds
are detached.

Only MSG_PEEK was somehow overlooked: it clones the fds while also
keeping them in the skb, so an external reference can definitely be
gained through MSG_PEEK without ever touching unix_gc_lock.

Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
---
include/net/af_unix.h | 1 +
net/unix/af_unix.c | 15 +++++++++++++--
net/unix/garbage.c | 6 ++++++
3 files changed, 20 insertions(+), 2 deletions(-)

--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -10,6 +10,7 @@ void unix_inflight(struct user_struct *u
 void unix_notinflight(struct user_struct *user, struct file *fp);
 void unix_gc(void);
 void wait_for_unix_gc(void);
+void unix_gc_barrier(void);
 struct sock *unix_get_socket(struct file *filp);
 struct sock *unix_peer_get(struct sock *);

--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1563,6 +1563,17 @@ static int unix_attach_fds(struct scm_co
 	return max_level;
 }
 
+static void unix_peek_fds(struct scm_cookie *scm, struct sk_buff *skb)
+{
+	scm->fp = scm_fp_dup(UNIXCB(skb).fp);
+	/*
+	 * During garbage collection it is assumed that in-flight sockets don't
+	 * get a new external reference. So we need to wait until current run
+	 * finishes.
+	 */
+	unix_gc_barrier();
+}
+
 static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb, bool send_fds)
 {
 	int err = 0;
@@ -2195,7 +2206,7 @@ static int unix_dgram_recvmsg(struct soc
 		sk_peek_offset_fwd(sk, size);
 
 		if (UNIXCB(skb).fp)
-			scm.fp = scm_fp_dup(UNIXCB(skb).fp);
+			unix_peek_fds(&scm, skb);
 	}
 	err = (flags & MSG_TRUNC) ? skb->len - skip : size;

@@ -2435,7 +2446,7 @@ static int unix_stream_read_generic(stru
 			/* It is questionable, see note in unix_dgram_recvmsg.
 			 */
 			if (UNIXCB(skb).fp)
-				scm.fp = scm_fp_dup(UNIXCB(skb).fp);
+				unix_peek_fds(&scm, skb);
 
 			sk_peek_offset_fwd(sk, chunk);

--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -266,6 +266,12 @@ void wait_for_unix_gc(void)
 	wait_event(unix_gc_wait, gc_in_progress == false);
 }
 
+void unix_gc_barrier(void)
+{
+	spin_lock(&unix_gc_lock);
+	spin_unlock(&unix_gc_lock);
+}
+
 /* The external entry point: unix_gc() */
 void unix_gc(void)
 {