Re: net: BUG in unix_notinflight

From: Willy Tarreau
Date: Tue Mar 07 2017 - 17:33:49 EST


On Wed, Mar 08, 2017 at 12:23:56AM +0200, Nikolay Borisov wrote:
>
> >>
> >>
> >> New report from linux-next/c0b7b2b33bd17f7155956d0338ce92615da686c9
> >>
> >> ------------[ cut here ]------------
> >> kernel BUG at net/unix/garbage.c:149!
> >> invalid opcode: 0000 [#1] SMP KASAN
> >> Dumping ftrace buffer:
> >> (ftrace buffer empty)
> >> Modules linked in:
> >> CPU: 0 PID: 1806 Comm: syz-executor7 Not tainted 4.10.0-next-20170303+ #6
> >> Hardware name: Google Google Compute Engine/Google Compute Engine,
> >> BIOS Google 01/01/2011
> >> task: ffff880121c64740 task.stack: ffff88012c9e8000
> >> RIP: 0010:unix_notinflight+0x417/0x5d0 net/unix/garbage.c:149
> >> RSP: 0018:ffff88012c9ef0f8 EFLAGS: 00010297
> >> RAX: ffff880121c64740 RBX: 1ffff1002593de23 RCX: ffff8801c490c628
> >> RDX: 0000000000000000 RSI: 1ffff1002593de27 RDI: ffffffff8557e504
> >> RBP: ffff88012c9ef220 R08: 0000000000000001 R09: 0000000000000000
> >> R10: dffffc0000000000 R11: ffffed002593de55 R12: ffff8801c490c0c0
> >> R13: ffff88012c9ef1f8 R14: ffffffff85101620 R15: dffffc0000000000
> >> FS: 00000000013d3940(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
> >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: 0000000001fd8cd8 CR3: 00000001cce69000 CR4: 00000000001426f0
> >> Call Trace:
> >> unix_detach_fds.isra.23+0xfa/0x170 net/unix/af_unix.c:1490
> >> unix_destruct_scm+0xf4/0x200 net/unix/af_unix.c:1499
> >
> > The problem here is that no lock protects concurrent unix_detach_fds()
> > calls. Even though unix_notinflight() itself is already serialized, if we
> > call unix_notinflight() twice on the same file pointer, we trigger this bug...
> >
> > I don't know what the right lock is to serialize it.
> >
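
To illustrate the double-call scenario described in the quoted analysis, here
is a minimal userspace sketch in C. It is not the kernel code: the toy_*
names, the struct layouts and the boolean return value are all hypothetical,
and the real bookkeeping in net/unix/garbage.c differs in detail. It only
shows why serializing the decrement itself does not help when the same file
list is detached twice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket's in-flight bookkeeping. */
struct toy_sock {
    long inflight;  /* in-flight references to this socket */
    bool on_list;   /* linked on the toy in-flight list? */
};

/* Hypothetical stand-in for the list of passed files attached to a message. */
struct toy_fd_list {
    struct toy_sock **fp;
    size_t count;
};

/* Account one more in-flight reference. */
static void toy_inflight(struct toy_sock *s)
{
    if (s->inflight++ == 0)
        s->on_list = true;
}

/*
 * Drop one in-flight reference. Imagine the whole body running under a
 * lock: that keeps the counter consistent, but it cannot tell a legitimate
 * call from a repeated one. Returns false where a kernel-style sanity
 * check would fire.
 */
static bool toy_notinflight(struct toy_sock *s)
{
    if (s->inflight == 0 || !s->on_list)
        return false;
    if (--s->inflight == 0)
        s->on_list = false;
    return true;
}

/*
 * Detach every file in the list. Nothing here marks the list as already
 * detached, so two callers holding the same list both walk it.
 * Returns the number of failed sanity checks.
 */
static size_t toy_detach_fds(struct toy_fd_list *l)
{
    size_t failures = 0;

    for (size_t i = 0; i < l->count; i++)
        if (!toy_notinflight(l->fp[i]))
            failures++;
    return failures;
}
```

In this sketch, calling toy_inflight() once and then toy_detach_fds() twice on
the same list makes the second call report a failed check, which is the
situation the quoted analysis describes for the real code.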
>
>
> I reported something similar a while ago
> https://lists.gt.net/linux/kernel/2534612
>
> And Miklos Szeredi then produced the following patch :
>
> https://patchwork.kernel.org/patch/9305121/
>
> However, this was never applied. I wonder if the patch makes sense?

I don't know, but there's a hint at the bottom of that thread: Davem's
response, to which there was no follow-up:

"Why would I apply a patch that's an RFC, doesn't have a proper commit
message, lacks a proper signoff, and also lacks ACK's and feedback
from other knowledgable developers?"

So at least this point makes sense. Maybe the patch is fine but was
not sufficiently reviewed or acked? Maybe it was proposed as an RFC
to start a discussion and never reached the stage of a final patch
waiting to be applied?

Willy