Re: [PATCH v1 0/3] usb: gadget: uvc: stability fixes on STREAMOFF.

From: Michael Grzeschik
Date: Sun Oct 08 2023 - 15:48:22 EST


On Fri, Oct 06, 2023 at 04:48:19PM -0700, Avichal Rakesh wrote:
On 10/6/23 15:53, Michael Grzeschik wrote:
On Fri, Oct 06, 2023 at 10:00:11AM -0700, Avichal Rakesh wrote:


On 10/5/23 15:05, Michael Grzeschik wrote:
Hi Avichal,

On Thu, Oct 05, 2023 at 11:30:32AM -0700, Avichal Rakesh wrote:
On 10/5/23 03:14, Michael Grzeschik wrote:
On Thu, Oct 05, 2023 at 11:23:27AM +0300, Laurent Pinchart wrote:
On Tue, Oct 03, 2023 at 01:09:06PM +0200, Michael Grzeschik wrote:
On Sat, Sep 30, 2023 at 11:48:18AM -0700, Avichal Rakesh wrote:
> We have been seeing two main stability issues that uvc gadget driver
> runs into when stopping streams:
>  1. Attempting to queue usb_requests to a disabled usb_ep
>  2. use-after-free issue for inflight usb_requests
>
> The three patches below fix the two issues above. Patch 1/3 fixes the
> first issue, and Patch 2/3 and 3/3 fix the second issue.
>
> Avichal Rakesh (3):
>   usb: gadget: uvc: prevent use of disabled endpoint
>   usb: gadget: uvc: Allocate uvc_requests one at a time
>   usb: gadget: uvc: Fix use-after-free for inflight usb_requests
>
> drivers/usb/gadget/function/f_uvc.c     |  11 +-
> drivers/usb/gadget/function/f_uvc.h     |   2 +-
> drivers/usb/gadget/function/uvc.h       |   6 +-
> drivers/usb/gadget/function/uvc_v4l2.c  |  21 ++-
> drivers/usb/gadget/function/uvc_video.c | 189 +++++++++++++++++-------
> 5 files changed, 164 insertions(+), 65 deletions(-)

These patches are not applying on gregkh/usb-testing since
Greg did take my patches first. I have already rebased them.

I think they got merged too soon :-( We could fix things on top, but
there's very little time to do so for v6.7.

Agreed. I was jumping from one workaround to another one, since this
is not easy to fix in a proper way. And still after this long discussion
with Avichal I don't think we are there yet.


So far the first two patches from Avichal look legit. But the overall
Use-After-Free fix is yet to be done properly.

The "abondoned" method he suggested is really bad to follow and will
add too much complexity and will be hard to debug.

IMHO it should be possible to introduce two cleanup pathes.

One path would be in the uvc_cleanup_requests that will cleanup the
requests that are actually not used in the controller and are registered
in the req_free list.

The second path would be the complete functions that are being run
from the controller and will ensure that the cleanup will really free
the requests from the controller after they were consumed.

What do you think?

I am not sure I follow. Patch 3/3 does exactly what you say here.

Yes, it was just to summ up what the latest state of the idea was,
so Laurent does not read the whole thread in detail. Sorry for not
being clear enough about that.

Whoops! Sorry about the misunderstanding!


There are two cleanup paths:
 1. uvcg_video_disable cleans up only the requests in req_free, and
 2. complete handler cleans up the in-flight requests.

The "abandoned" flag is simply to let the completion handler know
which requests to clean up and which ones to re-queue back to
the gadget driver.

What I don't get is, why in the case of shutdown there needs to
be something re-queued back to the gadget driver. There should not
need to be any sort of barrier flag for the requests. Just the
complete handler running past a barrier where it knows that the
whole device is stopped. So every call on complete should then clean
that exact request it is touching currently.

I don't know where the extra complexity comes from.

A lot of this complexity comes from assuming a back to back
STREAMOFF -> STREAMON sequence is possible where the gadget driver
doesn't have the time to clean up all in-flight usb_requests.
However, looking through the usb gadget APIs again, and it
looks like  usb_ep_disable enforces that all requests will
be sent back to the gadget driver before it returns.

Great!

Uhh...apologies, I will have to take this back. I've been
trying to use uvc->state as the condition for when completion
handler should clean up usb_requests, and I cannot figure
out a way to do so cleanly.

The fundamental problem with using uvc->state is that it is
not protected by any locks. So there is no real way to
assert that its value has not changed between reading
uvc->state and acting on it.

Naively we can write something like the following in the
completion handler:

void uvc_video_complete(...) {
if (uvc->state != UVC_EVENT_STREAMING) {
usb_ep_free_request(....);
} else {
// handle usb_request normally
}
}

But without any locks, there are no guarantees that
uvc->state didn't mutate immediately after the if
condition was checked, and the complete handler is
handling a request that it should've freed instead
or vice-versa. This argument would hold for any logic
we guard with uvc->state, making uvc->state effectively
useless as a check for freeing memory.

Yes, this makes total sense. Since the above condition was also part of
the wait_event patch you created in the first place, I bet this issue
was there aswell and was probably causing the issues I saw while testing
it.


We can work around it by either
1. Locking uvc->state with some driver level lock
to ensure that we can trust the value of uvc->state
at least for a little while, or
2. Using some other barrier condition that is protected by
another lock

If we go with (1), we'd have to add a lock around every
and every write to uvc->state, which isn't terrible, but
would require more testing to ensure that it doesn't
create any new deadlocks.

For (2), with the realization that usb_ep_disable flushes
all requests, we can add a barrier in uvc_video, protected by
req_lock. That should simplify the logic a little bit and
will hopefully be easier to reason about.

I could of course be missing a simpler solution here,
and am happy to be wrong. So please let me know if you
have any other ideas on how to guarantee such a check.

For now, I have no better Idea. Idea (2) sounds like
a good compromise. But I will have to review that code
to really judge.

Thanks for the work!

Michael


--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

Attachment: signature.asc
Description: PGP signature