Re: [PATCH] usb: dwc3: fix race of usb_gadget_driver operation

From: Alan Stern
Date: Sat Jun 26 2021 - 11:12:19 EST


On Sat, Jun 26, 2021 at 09:16:25AM +0800, linyyuan@xxxxxxxxxxxxxx wrote:
> On 2021-06-26 00:37, Alan Stern wrote:
> > On Fri, Jun 25, 2021 at 06:44:15PM +0800, Linyu Yuan wrote:

> > > --- a/drivers/usb/dwc3/ep0.c
> > > +++ b/drivers/usb/dwc3/ep0.c
> > > @@ -597,10 +597,11 @@ static int dwc3_ep0_set_address(struct dwc3
> > > *dwc, struct usb_ctrlrequest *ctrl)
> > >
> > > static int dwc3_ep0_delegate_req(struct dwc3 *dwc, struct
> > > usb_ctrlrequest *ctrl)
> > > {
> > > - int ret;
> > > + int ret = 0;
> > >
> > > spin_unlock(&dwc->lock);
> > > - ret = dwc->gadget_driver->setup(dwc->gadget, ctrl);
> > > + if (dwc->async_callbacks)
> > > + ret = dwc->gadget_driver->setup(dwc->gadget, ctrl);
> > > spin_lock(&dwc->lock);
> >
> > Here and in the other places, you should test dwc->async_callbacks
> > _before_ dropping the spinlock. Otherwise there is a race (the flag
> > could be written at about the same time it is checked).
> thanks for your comments,
>
> if you think there is race here, how to make sure gadget_driver pointer is
> safe,
> this is closest place where we can confirm it is non-NULL by checking
> async_callbacks ?

I explained this twice already: We know that gadget_driver is not
NULL because usb_gadget_remove_driver calls synchronize_irq before
doing usb_gadget_udc_stop.

Look at this timing diagram:

CPU0 CPU1
---- ----
IRQ happens for setup packet
Handler sees async_callbacks
is enabled
Handler unlocks dwc->lock
usb_gadget_remove_driver runs
Disables async callbacks
Calls synchronize_irq
Handler calls dwc-> . waits for IRQ handler to
gadget_driver->setup . return
Handler locks dwc-lock .
... .
Handler returns .
. synchronize_irq returns
Calls usb_gadget_udc_stop
dwc->gadget_driver is
set to NULL

As you can see, dwc->gadget_driver is non-NULL when CPU0 uses it,
even though async_callbacks gets cleared during the time when the
lock is released.

Alan Stern