Re: [PATCH] media: uvcvideo: Filter hw errors while enumerating controls

From: Ricardo Ribalda
Date: Thu Dec 19 2024 - 11:19:10 EST


On Thu, 19 Dec 2024 at 17:05, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>
> Hi,
>
> On 19-Dec-24 4:53 PM, Ricardo Ribalda wrote:
> > On Thu, 19 Dec 2024 at 16:41, Laurent Pinchart
> > <laurent.pinchart@xxxxxxxxxxxxxxxx> wrote:
> >>
> >> On Thu, Dec 19, 2024 at 04:35:37PM +0100, Ricardo Ribalda wrote:
> >>> On Thu, 19 Dec 2024 at 15:41, Laurent Pinchart wrote:
> >>>> On Thu, Dec 19, 2024 at 09:17:31AM +0100, Ricardo Ribalda wrote:
> >>>>> On Thu, 19 Dec 2024 at 00:27, Laurent Pinchart wrote:
> >>>>>> On Fri, Dec 13, 2024 at 11:21:02AM +0000, Ricardo Ribalda wrote:
> >>>>>>> To implement VIDIOC_QUERYCTRL, we need to read from the hardware all the
> >>>>>>> values that were not cached previously. If that read fails, we used to
> >>>>>>> report back the error to the user.
> >>>>>>>
> >>>>>>> Unfortunately this does not play nice with userspace. When they are
> >>>>>>> enumerating the contols, the only expect an error when there are no
> >>>>>>> "next" control.
> >>>>>>>
> >>>>>>> This is probably a corner case, and could be handled in userspace, but
> >>>>>>> both v4l2-ctl and yavta fail to enumerate all the controls if we return
> >>>>>>> then -EIO during VIDIOC_QUERYCTRL. I suspect that there are tons of
> >>>>>>> userspace apps handling this wrongly as well.
> >>>>>>>
> >>>>>>> This patch get around this issue by ignoring the hardware errors and
> >>>>>>> always returning 0 if a control exists.
> >>>>>>>
> >>>>>>> Signed-off-by: Ricardo Ribalda <ribalda@xxxxxxxxxxxx>
> >>>>>>> ---
> >>>>>>> Hi 2*Hans and Laurent!
> >>>>>>>
> >>>>>>> I came around a device that was listing just a couple of controls when
> >>>>>>> it should be listing much more.
> >>>>>>>
> >>>>>>> Some debugging latter I found that the device was returning -EIO when
> >>>>>>> all the focal controls were read.
> >>>>>>
> >>>>>> Was it transient and random errors, or does the device always fail for
> >>>>>> those controls ?
> >>>>>
> >>>>> For one of the devices the control is always failing (or I could not
> >>>>> find a combination that made it work).
> >>>>>
> >>>>> For the other it was more or less random.
> >>>>
> >>>> Are there other controls that failed for that device ? And what
> >>>> request(s) fail, is it only GET_CUR or also GET_MIN/GET_MAX/GET_RES ?
> >>>
> >>> It is a mix.
> >>>
> >>>> What's the approximate frequency of those random failures ?
> >>>>
> >>>>>>> This could be solved in userspace.. but I suspect that a lot of people
> >>>>>>> has copied the implementation of v4l-utils or yavta.
> >>>>>>>
> >>>>>>> What do you think of this workaround?
> >>>>>>
> >>>>>> Pretending that the control could be queried is problematic. We'll
> >>>>>> return invalid values to the user, I don't think that's a good idea. If
> >>>>>> the problematic device always returns error for focus controls, we could
> >>>>>> add a quirk, and extend the uvc_device_info structure to list the
> >>>>>> controls to ignore.
> >>>>>>
> >>>>>> Another option would be to skip over controls that return -EIO within
> >>>>>> the kernel, and mark those controls are broken. I think this could be
> >>>>>> done transparently for userspace, the first time we try to populate the
> >>>>>> cache for such controls, a -EIO error would mark the control as broken,
> >>>>>> and from a userspace point of view it wouldn't be visible through as
> >>>>>> ioctl.
> >>>>>
> >>>>> I see a couple of issues with this:
> >>>>> - There are controls that fail randomly.
> >>>>> - There are controls that fail based on the value of other controls
> >>>>> (yeah, I know).
> >>>>
> >>>> I was fearing there would be random (or random-looking) failures, as
> >>>> that can preclude marking the controls as broken and fully hiding them
> >>>> from userspace :-(
> >>>>
> >>>>> - There are controls that do not implement RES, MIN, or MAX, but
> >>>>> besides that, they are completely functional.
> >>>>> In any of those cases we do not want to skip those controls.
> >>>>>
> >>>>> I am not against quirking specific cameras once we detect that they
> >>>>> are broken...
> >>>>
> >>>> Hopefully there won't be too many of those, right ? Righhhht... ?
> >>>
> >>> So far I have identified 4 in a week, and I am not testing obscure
> >>> camera modules....
> >>
> >> Can you provide more information about those modules ? USB descriptors
> >> maybe, and the list of controls that fail, and how they fail ?
> >
> > These are the ones I can share now:
> >
> > "13d3:5519": Focus value out of range
> > focus_absolute 0x009a090a (int) : min=355 max=790 step=1 default=6
> > value=500 flags=inactive
>
> Hmm this one looks like min and default are swapped ?
>
> So I guess this one needs a quirk which checks if default < min
> and in that case swaps them (the check is to avoid swapping
> with fixed fw). If these are built into chromebooks how about
> doing a fwupdate for the camera instead ?

We do fwupdate whenever possible. But some modules are not updateable.
They either: lack DFU, or the flash is read-only, or the update
process has a non acceptable fail rate.

We aim to detect compliance errors early in the development process.
V4L2-compliance now (almost) works with the uvcvideo driver. And that
helps a lot :)

I plan to add quirks for the cameras that I can test. But we still
need a solution for all the external cameras and modules that are not
in the lab.

>
> > "3277:0003": Focus returns -EIO
> > Focus Absolute and Focus, Automatic Continuous: return -EIO for at
> > least one of get_ max/min/res
> >
> > "0408:302f": Error reading AutoExposure Flags
> > UVC_GET_INFO returns invalid flags
>
> Regards,
>
> Hans
>
>
>


--
Ricardo Ribalda