Re: [PATCH 0/3] platform/x86: asus-wmi: revert ROG Ally quirks

From: Hans de Goede
Date: Sat Oct 05 2024 - 17:59:35 EST


Hi Luke,

On 5-Oct-24 9:48 PM, Luke Jones wrote:
> Hi Hans,
>
> On Sun, 6 Oct 2024, at 3:37 AM, Hans de Goede wrote:
>> Hi Luke,
>>
>> On 26-Sep-24 11:53 AM, Luke D. Jones wrote:
>>> The ASUS ROG Ally (and Ally X) quirks that I added over the last year
>>> are not required. I worked with ASUS to pinpoint the exact cause of
>>> the original issue (MCU USB dev missing every second resume) and the
>>> result is a new MCU firmware which will be released on approx 16/10/24.
>>
>> First of all let me say that it is great that you have gotten Asus
>> to come up with a fixed firmware, thank you.
>>
>> With that said I believe that it is way too early to revert these quirks.
>> Users are usually not great at installing BIOS updates, and that assumes
>> this will be handled as part of a BIOS update. If it requires running
>> a separate tool then the chances of users installing the update
>> will likely be even worse.
>>
>> So IMHO for now we should keep these quirks around to avoid regressions
>> for users who don't have the MCU update.
>
> I wasn't sure how best to handle it, mostly the intention was to publicise things. In any case the quirks don't affect the new FW update at all and most folks won't ever notice.

I think we can look at dropping the quirks in maybe a year from now
or some such. Doing it right now feels like a bit too quick after
the fw fix.

And as mentioned elsewhere in the thread, if possible it would be
good if some other driver, e.g. hid-asus, could check the FW version
and log a warning if the old version is still found.
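
A rough userspace sketch of such a check follows. It is not actual
hid-asus code: the struct, the function names and the three-part
version number are all assumptions made for illustration, and the
first fixed version would have to come from Asus. In a real driver
the message would go through something like dev_warn() rather than
fprintf().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical representation of an MCU firmware version. */
struct mcu_fw_version {
	unsigned int major;
	unsigned int minor;
	unsigned int patch;
};

/* Returns <0, 0 or >0 if a is older than, equal to or newer than b. */
static int mcu_fw_version_cmp(const struct mcu_fw_version *a,
			      const struct mcu_fw_version *b)
{
	if (a->major != b->major)
		return a->major < b->major ? -1 : 1;
	if (a->minor != b->minor)
		return a->minor < b->minor ? -1 : 1;
	if (a->patch != b->patch)
		return a->patch < b->patch ? -1 : 1;
	return 0;
}

/*
 * Log a warning if the firmware that was found predates the first
 * release containing the fix. Returns true when a warning was logged.
 */
static bool mcu_fw_warn_if_old(const struct mcu_fw_version *found,
			       const struct mcu_fw_version *first_fixed)
{
	if (mcu_fw_version_cmp(found, first_fixed) >= 0)
		return false;

	fprintf(stderr,
		"MCU firmware %u.%u.%u is older than %u.%u.%u, please update\n",
		found->major, found->minor, found->patch,
		first_fixed->major, first_fixed->minor, first_fixed->patch);
	return true;
}
```

The caller would read the version from the device once at probe time
and pass it in together with the first fixed version, so the warning
shows up once per boot rather than on every resume.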

>> Related, have you seen this series:
>>
>> https://lore.kernel.org/platform-driver-x86/20240922172258.48435-1-lkml@xxxxxxxxxxx/
>>
>> that seems to fix the same issue?
>
> The history of that is here https://lore.kernel.org/linux-pm/20240919171952.403745-1-lkml@xxxxxxxxxxx/#t
>
>> And it does so in another, arguably better way.
>
> It is a variation of the many many things I've tried while building a comprehensive set of data for ASUS to work with. You can achieve a similar thing with only s2idle_pm callbacks and Mario's patches to export the DSM screen-off as an external symbol. Better is subjective since it still fails to fix the initial reason this work ever started - fixing the Ally - unless delays are added.
>
>> Although unfortunately, as patch 3/5 shows, just calling the global
>> "display off" callback before suspending devices is not enough;
>> fixing things still requires inserting a sleep using a DMI quirk :|
>
> This is because the issue can only be fully fixed in FW. What is happening here is just another variation of the quirk and the things I mentioned above. It gets worse with a different compiler such as clang, or a different kernel config, or even distro. The cause of the issues is that a particular signal the MCU is waiting on may not occur, and that becomes wildly unpredictable depending on kernel config, compiler etc.
>
> Even Windows can have the issue we have here.
>
>> Still that series including the DMI quirk might be a cleaner way
>> to deal with this and if that is merged then dropping the quirks
>> from asus-wmi makes sense.
>
> All of this is fully negated by the coming firmware. Having said that, *if* there are any issues with these patches then those issues will never come to light with the new MCU FW either as it fixes the root cause of the issues seen.

That sounds great, once more thank you for working with Asus to
properly fix this.

> The mentioned patches achieve a similar result to using Mario's s2idle callback patches and using those in s2idle_pm_ops. But as seen above, the timing issue becomes apparent - and this is fixed only by using fixed FW.

Right. As I mentioned already in the other thread I am having second
thoughts about moving the LPS0 display power off call to before devices
are suspended. Doing so would mean that the display might still be on
when that call is made, and that call could disable power-resources which
are necessary for the display, causing issues when the display driver's
suspend method runs.

So I think that we need something closer to Mario's original POC from:

https://git.kernel.org/pub/scm/linux/kernel/git/superm1/linux.git/log/?h=superm1/dsm-screen-on-off

if we want to make the suspend order more like Windows and make
the LPS0 display off call when the last display is turned off.

And as you have explained making the suspend order more like Windows
is unrelated to the real cause for the ROG Ally MCU suspend issue,
so let's continue any discussion about suspend ordering in the other
thread.

Regards,

Hans