Re: Bug 215744 - input from the accelerometer disappeared, regression on amd_sfh on kernel 5.17

From: Basavaraj Natikar
Date: Mon Apr 04 2022 - 10:04:32 EST




On 4/4/2022 7:23 PM, Thorsten Leemhuis wrote:
> On 04.04.22 09:25, Thorsten Leemhuis wrote:
>> On 04.04.22 09:18, Basavaraj Natikar wrote:
>>>
>>> On 4/4/2022 12:05 PM, Thorsten Leemhuis wrote:
>>>> On 01.04.22 21:47, Basavaraj Natikar wrote:
>>>>> Committed patch is disabling the interrupt mode and does not cause any
>>>>> functionality or working issues.
>>>> Well, for the reporter it clearly does cause problems, unless something
>>>> in testing went sideways.
>>>>
>>>>> I also cross verified on 3 system and working fine on 5.17 and not able
>>>>> to reproduce or recreate.
>>>>> [...]
>>>>> ------------------------------------------------
>>>>>
>>>>> Looks like this is not regression. May be some hardware/firmware bug.
>>>> Well, from the point of the kernel development process it afaics is a
>>>> regression, unless the testing went sideways. It doesn't matter if the
>>>> root cause is in fact a hardware/firmware bug, as what matters in the
>>>> scope of the kernel development is: things worked, and now they don't.
>>>> For details please check this file and read the quotes from Linus:
>>> can you help to answer the below questions:
>> Me? No, I'm just the Linux kernels regression tracker trying to make
>> sure all regressions are handled appropriately. :-D
>>
>> Marco, can you help out here?
> Marco replied in private and allowed me to forward his reply:
>
> ```
> I can't since, as mentioned previously, this is the only AMD laptop
> device that I have.
> I am sure this is a regression for me, even if the issue is firmware
> related. I have tested the 5.17 stock arch kernel and the sensor is
> gone. With the last three patches reverted the sensor is back and
> working fine.
>
> I would love to verify if the issue is hardware or software related, but
> being outside of AMD and with AMD not releasing any public information
> with datasheet/specification on their Sensor Fusion Hub I really can't
> say anything specific.
>
> This still remains a regression, since the hardware was working before
> and now it doesn't.
>
> By the way, I already have seen also this rework of this specific driver
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fconqp%2Famd-sfh-hid-dkms&data=04%7C01%7CBasavaraj.Natikar%40amd.com%7C05891d0582f94d68e7f908da164272ca%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637846771908092322%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=nAmJdcP2ALl9cEaejWezLb9B3bU5Omp72Z6kjTbaBRY%3D&reserved=0 that even added a still
> missing handler for the lid sensor switch for disabling touchpad and
> keyboard, and all efforts tried to merge it upstream with all sorts of
> issues.
>
> Regardless of everything, this is a driver supported in kernel by AMD
> engineers, so all of this doesn't matter. On my hardware this three
> patches break a previously working hardware.
> ```

Thank You Marco for the information.

Could you please provide me below results for acceleration
by re-applying and reverting patch once again on same laptop.

Did you attempt to test it multiple times on the tip of the git to see
if the problems goes away?

if same test is performed multiple times with or without reverting patch
on same laptop on which issue is observed
we may see same working/issue behavior. if it is regressing then always
it should work with or without reverting patches on same laptop. is this
the case here?

Thanks,

Basavaraj


> Ciao, Thorsten
>
>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>>
>> P.S.: As the Linux kernel's regression tracker I'm getting a lot of
>> reports on my table. I can only look briefly into most of them and lack
>> knowledge about most of the areas they concern. I thus unfortunately
>> will sometimes get things wrong or miss something important. I hope
>> that's not the case here; if you think it is, don't hesitate to tell me
>> in a public reply, it's in everyone's interest to set the public record
>> straight.
>>
>>> Did you attempt to test it multiple times on the tip of the git to see
>>> if the problems goes away?
>>>
>>> if same test is performed multiple times with or without reverting patch
>>> on same platform (laptop/hardware/firmware) on which issue is observed
>>> we may see same working/issue behavior. if it is regressing then always
>>> it should work with or without reverting patches on same laptop. is this
>>> the case here?
>>>
>>> I don't see any regression here. I requested to retest with other
>>> hardware/platform/system also as per my above test (output) all working
>>> fine in 3 different platforms and not able to reproduce or recreate for
>>> my side on 5.17.
>>>
>>> Thanks,
>>>
>>> Basavaraj
>>>
>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fnext%2Flinux-next.git%2Fplain%2FDocumentation%2Fprocess%2Fhandling-regressions.rst&data=04%7C01%7CBasavaraj.Natikar%40amd.com%7C05891d0582f94d68e7f908da164272ca%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637846771908092322%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=9ORA2inmfocvEJ%2FXOov67q2ebzZrcuguViSPV%2B58yA0%3D&reserved=0
>>>>
>>>> Ciao, Thorsten
>>>>
>>>>> Just curious reverting this patch how it is working just suspecting
>>>>> firmware undefined behavior.
>>>>>
>>>>> If possible, please check on other platform/system also if same behavior
>>>>> occurs.
>>>>>
>>>>> Could you please provide me platform/system details so that I can check
>>>>> this behavior?
>>>>>
>>>>> Thanks,
>>>>> Basavaraj
>>>>>
>>>>> On 4/1/2022 1:36 PM, Thorsten Leemhuis wrote:
>>>>>> Hi, this is your Linux kernel regression tracker.
>>>>>>
>>>>>> I noticed a regression report in bugzilla.kernel.org that afaics nobody
>>>>>> acted upon since it was reported about a week ago, that's why I decided
>>>>>> to forward it to the lists and all people that seemed to be relevant
>>>>>> here. It looks to me like this is something for Basavaraj, as it seems
>>>>>> to be caused by b300667b33b2 ("HID: amd_sfh: Disable the interrupt for
>>>>>> all command"). But I'm not totally sure, I only looked briefly into the
>>>>>> details. Or was this discussed somewhere else already? Or even fixed?
>>>>>>
>>>>>> To quote from https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D215744&data=04%7C01%7CBasavaraj.Natikar%40amd.com%7C05891d0582f94d68e7f908da164272ca%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637846771908092322%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=pLGKHCV1ZNrXghGdDFG16sguRA8xi0VzUSG%2Fbw%2FQsBw%3D&reserved=0 :
>>>>>>
>>>>>>> Marco 2022-03-25 15:22:19 UTC
>>>>>>>
>>>>>>> After updating to 5.17, the input from the accelerometer disappeared, completely. No devices available from IIO tree. First bad commit causing it is https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2Fdrivers%2Fhid%2Famd-sfh-hid%2Famd_sfh_pcie.c%3Fid%3Db300667b33b2b5a2c8e5f8f22826befb3d7f4f2b&data=04%7C01%7CBasavaraj.Natikar%40amd.com%7C05891d0582f94d68e7f908da164272ca%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637846771908092322%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=L36JFc%2BnejM4bJXfNui49v2uJKrS4cJnU93dpEhXPms%3D&reserved=0. Reverting this and the the other two on top fixed this. Tried to not revert only the above mentioned commit, but it's still not working.
>>>>>>>
>>>>>>> Marco.
>>>>>> Anyway, to get this tracked:
>>>>>>
>>>>>> #regzbot introduced: b300667b33b2b5a2c8e5f8f22826befb3d7f4
>>>>>> #regzbot from: Marco <rodomar705@xxxxxxxxxxxxxx>
>>>>>> #regzbot title: input: hid: input from the accelerometer disappeared due
>>>>>> to changes to amd_sfh
>>>>>> #regzbot link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D215744&amp;data=04%7C01%7CBasavaraj.Natikar%40amd.com%7C05891d0582f94d68e7f908da164272ca%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637846771908092322%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=pLGKHCV1ZNrXghGdDFG16sguRA8xi0VzUSG%2Fbw%2FQsBw%3D&amp;reserved=0
>>>>>>
>>>>>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>>>>>>
>>>>>> P.S.: As the Linux kernel's regression tracker I'm getting a lot of
>>>>>> reports on my table. I can only look briefly into most of them and lack
>>>>>> knowledge about most of the areas they concern. I thus unfortunately
>>>>>> will sometimes get things wrong or miss something important. I hope
>>>>>> that's not the case here; if you think it is, don't hesitate to tell me
>>>>>> in a public reply, it's in everyone's interest to set the public record
>>>>>> straight.
>>>>>>
>>>
>>>