Re: [syzbot] general protection fault in gadget_setup
From: Alan Stern
Date: Tue Apr 13 2021 - 12:13:17 EST
On Tue, Apr 13, 2021 at 10:12:05AM +0200, Dmitry Vyukov wrote:
> On Tue, Apr 13, 2021 at 10:08 AM syzbot
> <syzbot+eb4674092e6cc8d9e0bd@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 0f4498ce Merge tag 'for-5.12/dm-fixes-2' of git://git.kern..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=124adbf6d00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=daeff30c2474a60f
> > dashboard link: https://syzkaller.appspot.com/bug?extid=eb4674092e6cc8d9e0bd
> > userspace arch: i386
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+eb4674092e6cc8d9e0bd@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> I suspect that the raw gadget_unbind() can be called while the timer
> is still active. gadget_unbind() sets gadget data to NULL.
> But I am not sure which unbind call this is:
> usb_gadget_remove_driver() or right in udc_bind_to_driver() due to a
> start error.
This certainly looks like a race between gadget_unbind and gadget_setup
in raw_gadget.
In theory, this race shouldn't matter. The gadget core is supposed to
guarantee that there won't be any more callbacks to the gadget driver
once the driver's unbind routine is called. That guarantee is enforced
in usb_gadget_remove_driver as follows:
usb_gadget_disconnect(udc->gadget);
if (udc->gadget->irq)
synchronize_irq(udc->gadget->irq);
udc->driver->unbind(udc->gadget);
usb_gadget_udc_stop(udc);
usb_gadget_disconnect turns off the pullup resistor, telling the host
that the gadget is no longer connected and preventing the transmission
of any more USB packets. Any packets that have already been received
are sure to processed by the UDC driver's interrupt handler by the time
synchronize_irq returns.
But this doesn't work with dummy_hcd, because dummy_hcd doesn't use
interrupts; it uses a timer instead. It does have code to emulate the
effect of synchronize_irq, but that code doesn't get invoked at the
right time -- it currently runs in usb_gadget_udc_stop, after the unbind
callback instead of before. Indeed, there's no way for
usb_gadget_remove_driver to invoke this code before the unbind
callback,.
I thought the synchronize_irq emulation problem had been completely
solved, but evidently it hasn't. It looks like the best solution is to
add a call of the synchronize_irq emulation code in dummy_pullup.
Maybe we can test this reasoning by putting a delay just before the call
to dum->driver->setup. That runs in the timer handler, so it's not a
good place to delay, but it may be okay just for testing purposes.
Hopefully this patch will make the race a lot more likely to occur. Is
there any way to tell syzkaller to test it, despite the fact that
syzkaller doesn't think it has a reproducer for this issue?
Alan Stern
Index: usb-devel/drivers/usb/gadget/udc/dummy_hcd.c
===================================================================
--- usb-devel.orig/drivers/usb/gadget/udc/dummy_hcd.c
+++ usb-devel/drivers/usb/gadget/udc/dummy_hcd.c
@@ -1900,6 +1900,7 @@ restart:
if (value > 0) {
++dum->callback_usage;
spin_unlock(&dum->lock);
+ mdelay(5);
value = dum->driver->setup(&dum->gadget,
&setup);
spin_lock(&dum->lock);