Re: [PATCH] media: uvcvideo: Filter hw errors while enumerating controls

From: Hans de Goede
Date: Thu Dec 19 2024 - 11:29:15 EST


Hi,

On 19-Dec-24 5:18 PM, Ricardo Ribalda wrote:
> On Thu, 19 Dec 2024 at 17:05, Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> On 19-Dec-24 4:53 PM, Ricardo Ribalda wrote:
>>> On Thu, 19 Dec 2024 at 16:41, Laurent Pinchart
>>> <laurent.pinchart@xxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> On Thu, Dec 19, 2024 at 04:35:37PM +0100, Ricardo Ribalda wrote:
>>>>> On Thu, 19 Dec 2024 at 15:41, Laurent Pinchart wrote:
>>>>>> On Thu, Dec 19, 2024 at 09:17:31AM +0100, Ricardo Ribalda wrote:
>>>>>>> On Thu, 19 Dec 2024 at 00:27, Laurent Pinchart wrote:
>>>>>>>> On Fri, Dec 13, 2024 at 11:21:02AM +0000, Ricardo Ribalda wrote:
>>>>>>>>> To implement VIDIOC_QUERYCTRL, we need to read from the hardware all the
>>>>>>>>> values that were not cached previously. If that read fails, we used to
>>>>>>>>> report back the error to the user.
>>>>>>>>>
>>>>>>>>> Unfortunately this does not play nice with userspace. When they are
>>>>>>>>> enumerating the contols, the only expect an error when there are no
>>>>>>>>> "next" control.
>>>>>>>>>
>>>>>>>>> This is probably a corner case, and could be handled in userspace, but
>>>>>>>>> both v4l2-ctl and yavta fail to enumerate all the controls if we return
>>>>>>>>> then -EIO during VIDIOC_QUERYCTRL. I suspect that there are tons of
>>>>>>>>> userspace apps handling this wrongly as well.
>>>>>>>>>
>>>>>>>>> This patch get around this issue by ignoring the hardware errors and
>>>>>>>>> always returning 0 if a control exists.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Ricardo Ribalda <ribalda@xxxxxxxxxxxx>
>>>>>>>>> ---
>>>>>>>>> Hi 2*Hans and Laurent!
>>>>>>>>>
>>>>>>>>> I came around a device that was listing just a couple of controls when
>>>>>>>>> it should be listing much more.
>>>>>>>>>
>>>>>>>>> Some debugging latter I found that the device was returning -EIO when
>>>>>>>>> all the focal controls were read.
>>>>>>>>
>>>>>>>> Was it transient and random errors, or does the device always fail for
>>>>>>>> those controls ?
>>>>>>>
>>>>>>> For one of the devices the control is always failing (or I could not
>>>>>>> find a combination that made it work).
>>>>>>>
>>>>>>> For the other it was more or less random.
>>>>>>
>>>>>> Are there other controls that failed for that device ? And what
>>>>>> request(s) fail, is it only GET_CUR or also GET_MIN/GET_MAX/GET_RES ?
>>>>>
>>>>> It is a mix.
>>>>>
>>>>>> What's the approximate frequency of those random failures ?
>>>>>>
>>>>>>>>> This could be solved in userspace.. but I suspect that a lot of people
>>>>>>>>> has copied the implementation of v4l-utils or yavta.
>>>>>>>>>
>>>>>>>>> What do you think of this workaround?
>>>>>>>>
>>>>>>>> Pretending that the control could be queried is problematic. We'll
>>>>>>>> return invalid values to the user, I don't think that's a good idea. If
>>>>>>>> the problematic device always returns error for focus controls, we could
>>>>>>>> add a quirk, and extend the uvc_device_info structure to list the
>>>>>>>> controls to ignore.
>>>>>>>>
>>>>>>>> Another option would be to skip over controls that return -EIO within
>>>>>>>> the kernel, and mark those controls are broken. I think this could be
>>>>>>>> done transparently for userspace, the first time we try to populate the
>>>>>>>> cache for such controls, a -EIO error would mark the control as broken,
>>>>>>>> and from a userspace point of view it wouldn't be visible through as
>>>>>>>> ioctl.
>>>>>>>
>>>>>>> I see a couple of issues with this:
>>>>>>> - There are controls that fail randomly.
>>>>>>> - There are controls that fail based on the value of other controls
>>>>>>> (yeah, I know).
>>>>>>
>>>>>> I was fearing there would be random (or random-looking) failures, as
>>>>>> that can preclude marking the controls as broken and fully hiding them
>>>>>> from userspace :-(
>>>>>>
>>>>>>> - There are controls that do not implement RES, MIN, or MAX, but
>>>>>>> besides that, they are completely functional.
>>>>>>> In any of those cases we do not want to skip those controls.
>>>>>>>
>>>>>>> I am not against quirking specific cameras once we detect that they
>>>>>>> are broken...
>>>>>>
>>>>>> Hopefully there won't be too many of those, right ? Righhhht... ?
>>>>>
>>>>> So far I have identified 4 in a week, and I am not testing obscure
>>>>> camera modules....
>>>>
>>>> Can you provide more information about those modules ? USB descriptors
>>>> maybe, and the list of controls that fail, and how they fail ?
>>>
>>> These are the ones I can share now:
>>>
>>> "13d3:5519": Focus value out of range
>>> focus_absolute 0x009a090a (int) : min=355 max=790 step=1 default=6
>>> value=500 flags=inactive
>>
>> Hmm this one looks like min and default are swapped ?
>>
>> So I guess this one needs a quirk which checks if default < min
>> and in that case swaps them (the check is to avoid swapping
>> with fixed fw). If these are built into chromebooks how about
>> doing a fwupdate for the camera instead ?
>
> We do fwupdate whenever possible. But some modules are not updateable.
> They either: lack DFU, or the flash is read-only, or the update
> process has a non acceptable fail rate.

Ok, I was just wondering if we could avoid having to add
a quirk for this model.

> We aim to detect compliance errors early in the development process.
> V4L2-compliance now (almost) works with the uvcvideo driver. And that
> helps a lot :)
>
> I plan to add quirks for the cameras that I can test. But we still
> need a solution for all the external cameras and modules that are not
> in the lab.

I agree we need some solution for this, especially the broken
controls hiding others is a problem which needs some workaround.

I'm not sure what that workaround should look like though.
Just returning 0 as your v2 patch does seems less then ideal.

Lets continue discussing this after the Christmas break.

Regards,

Hans



>>> "3277:0003": Focus returns -EIO
>>> Focus Absolute and Focus, Automatic Continuous: return -EIO for at
>>> least one of get_ max/min/res
>>>
>>> "0408:302f": Error reading AutoExposure Flags
>>> UVC_GET_INFO returns invalid flags
>>
>> Regards,
>>
>> Hans
>>
>>
>>
>
>