Re: [PATCH v1 0/3] usb: gadget: uvc: stability fixes on STREAMOFF.

From: Avichal Rakesh
Date: Wed Oct 11 2023 - 20:33:59 EST




On 10/8/23 12:48, Michael Grzeschik wrote:
> On Fri, Oct 06, 2023 at 04:48:19PM -0700, Avichal Rakesh wrote:
>> On 10/6/23 15:53, Michael Grzeschik wrote:
>>> On Fri, Oct 06, 2023 at 10:00:11AM -0700, Avichal Rakesh wrote:
>>>>
>>>>
>>>> On 10/5/23 15:05, Michael Grzeschik wrote:
>>>>> Hi Avichal,
>>>>>
>>>>> On Thu, Oct 05, 2023 at 11:30:32AM -0700, Avichal Rakesh wrote:
>>>>>> On 10/5/23 03:14, Michael Grzeschik wrote:
>>>>> <snip>
>>>>> I don't know where the extra complexity comes from.
>>>>
>>>> A lot of this complexity comes from assuming a back to back
>>>> STREAMOFF -> STREAMON sequence is possible where the gadget driver
>>>> doesn't have the time to clean up all in-flight usb_requests.
>>>> However, looking through the usb gadget APIs again, and it
>>>> looks like  usb_ep_disable enforces that all requests will
>>>> be sent back to the gadget driver before it returns.
>>>
>>> Great!
>>
>> Uhh...apologies, I will have to take this back. I've been
>> trying to use uvc->state as the condition for when completion
>> handler should clean up usb_requests, and I cannot figure
>> out a way to do so cleanly.
>>
>> The fundamental problem with using uvc->state is that it is
>> not protected by any locks. So there is no real way to
>> assert that its value has not changed between reading
>> uvc->state and acting on it.
>>
>> Naively we can write something like the following in the
>> completion handler:
>>
>> void uvc_video_complete(...) {
>>    if (uvc->state != UVC_EVENT_STREAMING) {
>>        usb_ep_free_request(....);
>>    } else {
>>        // handle usb_request normally
>>    }
>> }
>>
>> But without any locks, there are no guarantees that
>> uvc->state didn't mutate immediately after the if
>> condition was checked, and the complete handler is
>> handling a request that it should've freed instead
>> or vice-versa. This argument would hold for any logic
>> we guard with uvc->state, making uvc->state effectively
>> useless as a check for freeing memory.
>
> Yes, this makes total sense. Since the above condition was also part of
> the wait_event patch you created in the first place, I bet this issue
> was there aswell and was probably causing the issues I saw while testing
> it>
>
>> We can work around it by either
>> 1. Locking uvc->state with some driver level lock
>>   to ensure that we can trust the value of uvc->state
>>   at least for a little while, or
>> 2. Using some other barrier condition that is protected by
>>   another lock
>>
>> If we go with (1), we'd have to add a lock around every
>> and every write to uvc->state, which isn't terrible, but
>> would require more testing to ensure that it doesn't
>> create any new deadlocks.
>>
>> For (2), with the realization that usb_ep_disable flushes
>> all requests, we can add a barrier in uvc_video, protected by
>> req_lock. That should simplify the logic a little bit and
>> will hopefully be easier to reason about.
>>
>> I could of course be missing a simpler solution here,
>> and am happy to be wrong. So please let me know if you
>> have any other ideas on how to guarantee such a check.
>
> For now, I have no better Idea. Idea (2) sounds like
> a good compromise. But I will have to review that code
> to really judge.
>

Sent out v4 patches with option (2). It simplifies the logic
decently because we no longer have to reason about per-request
consistency. uvc_video now tracks its own state of whether
requests should be flowing or not based on calls to
uvcg_video_enable. This state is protected, and is the source
of truth for queueing usb_requests.

The last bit of complexity left is around returning in-flight
video buffers. AFAICT it should now be protected, and in my
local testing I didn't notice any un-returned buffers, but
please to take a look and let me know if your testing
uncovers anything.

Thanks,
Avi.