Re: [PATCH] usb: gadget: udc: udc_stop before gadget unbind

From: Felipe Balbi
Date: Tue Aug 15 2017 - 08:02:57 EST



Hi,

Danilo Krummrich <danilokrummrich@xxxxxxxxxxxxx> writes:
> thanks for reviewing.

np :-)

> On 2017-08-15 12:03, Felipe Balbi wrote:
>> Hi,
>>
>> Danilo Krummrich <danilokrummrich@xxxxxxxxxxxxx> writes:
>>> udc_stop needs to be called before gadget driver unbind. Otherwise it
>>> might happen that udc drivers still call into the gadget driver (e.g.
>>> to reset gadget after OTG event). If this happens it is likely to get
>>> panics from gadget driver dereferencing NULL ptr, as gadget's drvdata
>>> is set to NULL on unbind.
>>
>> seems like the problem here is with the OTG layer, not UDC core.
>>
> I mentioned this just as example, it can happen whenever a UDC driver
> calls
> the gadget driver (e.g. by calling usb_gadget_udc_reset() in ISR) after
> gadget
> drivers unbind() was called already (e.g. by gadget configfs).
> If this happens gadget drivers drvdata was already set to NULL by
> unbind()
> and reset() could result into a NULL ptr exception.
> Therefore my assumption was that it needs to be prevented that the
> gadget
> driver is getting called after unbind.

We have a known problem in the design of the gadget API that causes this
races but we couldn't come up with a solution yet :-)

Inverting these two calls is not the correct way to go about this :-)

>>> Signed-off-by: Danilo Krummrich <danilokrummrich@xxxxxxxxxxxxx>
>>> ---
>>> Actually there could still be a race:
>>> (CPU1 code taken from dwc3 drivers dwc3_disconnect_gadget() as
>>> exsample)
>>>
>>> CPU0 CPU1
>>> ---- ----
>>> usb_gadget_disconnect(udc->gadget);
>>> udc->driver->disconnect(udc->gadget);
>>> if (dwc->gadget_driver && dwc->gadget_driver->disconnect)
>>> usb_gadget_udc_stop(udc);
>>> udc->driver->unbind(udc->gadget);
>>> dwc->gadget_driver->disconnect(&dwc->gadget);
>>>
>>> UDC drivers typically set their gadget driver pointer to NULL in
>>> udc_stop
>>> and check for it before calling into the gadget driver. To fix the
>>> issue
>>> above every udc driver could apply a lock around this.
>>>
>>> If you see the need for having this or another solutions I can provide
>>> further patches. This patch could also just serve as a base for
>>> discussion
>>> if someone knows a smarter solution.
>>>
>>> I saw this problem causing a panic on hikey960 board and provided a
>>> quick
>>> workaround for the same problem here:
>>> https://android-review.googlesource.com/#/c/kernel/common/+/457476/
>>> (panic log in the commit message of the linked patch)
>>> ---
>>> drivers/usb/gadget/udc/core.c | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/usb/gadget/udc/core.c
>>> b/drivers/usb/gadget/udc/core.c
>>> index efce68e9a8e0..8155468afc0d 100644
>>> --- a/drivers/usb/gadget/udc/core.c
>>> +++ b/drivers/usb/gadget/udc/core.c
>>> @@ -1234,8 +1234,12 @@ static void usb_gadget_remove_driver(struct
>>> usb_udc *udc)
>>>
>>> usb_gadget_disconnect(udc->gadget);
>>> udc->driver->disconnect(udc->gadget);
>>> - udc->driver->unbind(udc->gadget);
>>> + /* udc_stop needs to be called before gadget driver unbind to
>>> prevent
>>> + * udc driver calls into gadget driver after unbind which could
>>> cause
>>> + * a nullptr exception.
>>> + */
>>> usb_gadget_udc_stop(udc);
>>> + udc->driver->unbind(udc->gadget);
>>
>> This patch is incorrect, it will prevent us from giving back requests
>> to
>> gadget driver properly. ->unbind() has to happen before ->udc_stop().
>
> Do you mean after udc_stop the udc driver can not call the gadget driver
> anymore? If not, I did not got your point, sorry for that. Can you
> please
> help me out? Would the changed order raise another issue I'm not aware
> of?

right, ->udc_stop() is supposed to completely teardown the USB
controller, including disabling interrupts and all. The only thing it
_can_ do from ->udc_stop() would be giving back any pending requests
that were left (which would cause req->complete() to be called with an
error status). But even that is unlikely in the case you mention since
->unbind() was already called.

> If I understood you correctly, without this patch udc driver can not
> call
> the gadget driver back as well, because this would result in a NULL ptr
> dereference, as unbind() sets drvdata to NULL.
>
> In any case the race described in my original message can still happen,
> regardless of the order of udc_stop and unbind. But with this patch the
> needed locking could easily done within the udc driver only. Without,
> the
> lock needs to be acquired before udc->driver->unbind(udc->gadget) and
> released after usb_gadget_udc_stop(). Otherwise an ISR of the udc driver
> trying to call into the gadget driver could do this after gadget driver
> already unbound.

right

--
balbi

Attachment: signature.asc
Description: PGP signature