Re: [PATCH v2 2/4] media: uvcvideo: Do not set an async control owned by other fh
From: Laurent Pinchart
Date: Mon Dec 02 2024 - 03:12:29 EST
On Mon, Dec 02, 2024 at 09:05:07AM +0100, Hans Verkuil wrote:
> On 02/12/2024 01:18, Laurent Pinchart wrote:
> > On Fri, Nov 29, 2024 at 11:18:54PM +0100, Ricardo Ribalda wrote:
> >> On Fri, 29 Nov 2024 at 23:03, Laurent Pinchart wrote:
> >>> On Fri, Nov 29, 2024 at 07:47:31PM +0100, Ricardo Ribalda wrote:
> >>>> Before we all go on a well deserved weekend, let me recap what we
> >>>> know. If I did not get something correctly, let me know.
> >>>>
> >>>> 1) Well behaved devices do not allow to set or get an incomplete async
> >>>> control. They will stall instead (ref: Figure 2-21 in UVC 1.5 )
> >>>> 2) Both Laurent and Ricardo consider that there is a big chance that
> >>>> some camera modules do not implement this properly. (ref: years of
> >>>> crying over broken module firmware :) )
> >>>>
> >>>> 3) ctrl->handle is designed to point to the fh that originated the
> >>>> control. So the logic can decide if the originator needs to be
> >>>> notified or not. (ref: uvc_ctrl_send_event() )
> >>>> 4) Right now we replace the originator in ctrl->handle for unfinished
> >>>> async controls. (ref:
> >>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/media/usb/uvc/uvc_ctrl.c#n2050)
> >>>>
> >>>> My interpretation is that:
> >>>> A) We need to change 4). We shall not change the originator of
> >>>> unfinished ctrl->handle.
> >>>> B) Well behaved cameras do not need the patch "Do not set an async
> >>>> control owned by another fh"
> >>>> C) For badly behaved cameras, it is fine if we slightly break the
> >>>> v4l2-compliance in corner cases, if we do not break any internal data
> >>>> structure.
> >>>
> >>> The fact that some devices may not implement the documented behaviour
> >>> correctly may not be a problem. Well-behaved devices will stall, which
> >>> means we shouldn't query the device while as async update is in
> >>> progress. Badly-behaved devices, whatever they do when queried, should
> >>> not cause any issue if we don't query them.
> >>
> >> I thought we could detect the stall and return safely. Isn't that the case?
> >
> > We could, but if we know the device will stall anyway, is there a reason
> > not to avoid issuing the request in the first place ?
> >
> >> Why we have not seen issues with this?
> >
> > I haven't tested a PTZ device for a very long time, and you would need
> > to hit a small time window to see the issue.
> >
> >>> We should not send GET_CUR and SET_CUR requests to the device while an
> >>> async update is in progress, and use cached values instead. When we
> >>> receive the async update event, we should clear the cache. This will be
> >>> the same for both well-behaved and badly-behaved devices, so we can
> >>> expose the same behaviour towards userspace.
> >>
> >> seting ctrl->loaded = 0 when we get an event sounds like a good idea
> >> and something we can implement right away.
> >> If I have to resend the set I will add it to the end.
> >>
> >>> We possibly also need some kind of timeout mechanism to cope with the
> >>> async update event not being delivered by the device.
> >>
> >> This is the part that worries me the most:
> >> - timeouts make the code fragile
> >> - What is a good value for timeout? 1 second, 30, 300? I do not think
> >> that we can find a value.
> >
> > I've been thinking about the implementation of uvc_fh cleanup over the
> > weekend, and having a timeout would have the nice advantage that we
> > could reference-count uvc_fh instead of implementing a cleanup that
> > walks over all controls when closing a file handle. I think it would
> > make the code simpler, and possibly safer too.
> >
> >>> Regarding the userspace behaviour during an auto-update, we have
> >>> multiple options:
> >>>
> >>> For control get,
> >>>
> >>> - We can return -EBUSY
> >>> - We can return the old value from the cache
>
> This would match the control behavior best. Only when the operation is
> done is the control updated and the control event sent.
>
> Some questions: is any of this documented for UVC? Because this is non-standard
No this isn't documented.
> behavior. Are there applications that rely on this? Should we perhaps add
I don't know.
> proper support for this to the control framework? E.g. add an ASYNC flag and
> document this?
We could, but this is such a specific use case that I don't think is
worth adding complexity to the already complex control framework would
be worth it. What we could do is perhaps adding a flag for the userspace
API, but even there, I never like modelling an API with a single user.
> >>> - We can return the new value fromt he cache
> >>>
> >>> Returning -EBUSY would be simpler to implement.
> >>
> >> Not only easy, I think it is the most correct,
> >>
> >>> I don't think the behaviour should depend on whether the control is read
> >>> on the file handle that initiated the async operation or on a different
> >>> file handle.
> >>>
> >>> For control set, I don't think we can do much else than returning
> >>> -EBUSY, regardless of which file handle the control is set on.
> >>
> >> ACK.
> >>
> >>>> I will send a new version with my interpretation.
> >>>>
> >>>> Thanks for a great discussion
> >>
> >> Looking with some perspective... I believe that we should look into
> >> the "userspace behaviour for auto controls" in a different patchset.
> >> It is slightly unrelated to this discussion.
--
Regards,
Laurent Pinchart