Re: [PATCH V2] usb: gadget: f_fs: don't free buffer prematurely

From: Alan Stern
Date: Wed Mar 20 2019 - 12:52:49 EST

On Wed, 20 Mar 2019, John Stultz wrote:

> Hey Fei,
> So while this patch does resolve the issues I was seeing with
> mainline kernels and recent changes to adbd, Josh pointed out that it
> wouldn't resolve the issues I was seeing with older kernels, which are
> slightly different (but still related to aio usage).
> On the older kernels I'm hitting scheduling while atomic on reboot,
> which seems to be due to ffs_aio_cancel() taking a spinlock then
> calling usb_ep_dequeue() which might sleep.
> It seems a fix for this was tried earlier with d52e4d0c0c428 ("usb:
> gadget: ffs: Fix BUG when userland exits with submitted AIO
> transfers") which was then reverted by a9c859033f6e.
> Elsewhere it seems the ffs driver takes effort to drop any locks
> before calling usb_ep_dequeue(), so it seems like this should be
> addressed, but it also seems like a recent change to the dwc3 driver has
> been made to avoid sleeping in that path (see fec9095bdef4 ("usb:
> dwc3: gadget: remove wait_end_transfer")), which may be why I'm not
> seeing the problem with mainline (and your patch here, of course).
> But that also doesn't clarify if it's still a potential issue w/
> non-dwc3 platforms.
> So for older kernels, do you have a suggestion of which approach is
> advised? Does usb_ep_dequeue need to avoid sleeping or do we need to
> rework the ffs_aio_cancel logic?

usb_ep_dequeue can be called in interrupt context, meaning it is never
allowed to sleep. This is mentioned in the kerneldoc:

* usb_ep_dequeue - dequeues (cancels, unlinks) an I/O request from an endpoint
* @ep:the endpoint associated with the request
* @req:the request being canceled
*
* If the request is still active on the endpoint, it is dequeued and
* eventually its completion routine is called (with status -ECONNRESET);
* else a negative error code is returned. This routine is asynchronous,
* that is, it may return before the completion routine runs.
*
* Note that some hardware can't clear out write fifos (to unlink the request
* at the head of the queue) except as part of disconnecting from usb. Such
* restrictions prevent drivers from supporting configuration changes,
* even to configuration zero (a "chapter 9" requirement).
*
* This routine may be called in interrupt context.
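
To make the constraint concrete, here is a rough user-space toy model of
the situation described above: a cancel path that takes a lock and then
calls a dequeue routine. It is only a sketch, not kernel code. Every name
in it (model_ctx, model_lock, model_might_sleep, dequeue_sleeping,
dequeue_atomic_safe, cancel_under_lock) is made up for illustration, and
the "lock" is just a counter standing in for atomic context.

```c
#include <stddef.h>

/* Hypothetical model state; not a kernel structure. */
struct model_ctx {
	int atomic_depth;	/* > 0 while a modeled spinlock is held */
	int violations;		/* sleep attempts seen in atomic context */
};

/* Stand-ins for taking and dropping a spinlock. */
static void model_lock(struct model_ctx *c)   { c->atomic_depth++; }
static void model_unlock(struct model_ctx *c) { c->atomic_depth--; }

/* Stand-in for a might_sleep()-style check: count misuse, don't crash. */
static void model_might_sleep(struct model_ctx *c)
{
	if (c->atomic_depth > 0)
		c->violations++;
}

/* A dequeue implementation that waits, so it may sleep. */
static void dequeue_sleeping(struct model_ctx *c)
{
	model_might_sleep(c);
}

/* A dequeue implementation that never sleeps. */
static void dequeue_atomic_safe(struct model_ctx *c)
{
	(void)c;
}

typedef void (*dequeue_fn)(struct model_ctx *);

/*
 * Modeled cancel path: take the lock, call dequeue, drop the lock.
 * Returns how many "sleeping while atomic" violations were recorded.
 */
static int cancel_under_lock(dequeue_fn dq)
{
	struct model_ctx c = { 0, 0 };

	model_lock(&c);
	dq(&c);
	model_unlock(&c);
	return c.violations;
}
```

In this model, cancel_under_lock(dequeue_sleeping) records one violation
and cancel_under_lock(dequeue_atomic_safe) records none. The caller is
the same in both cases; only the dequeue implementation differs.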

Alan Stern