Re: Fwd: Steam Deck OLED 6.8.2 nau8821-max fails

From: Linux regression tracking (Thorsten Leemhuis)
Date: Tue Apr 09 2024 - 05:20:07 EST


On 09.04.24 10:47, Cristian Ciocaltea wrote:
> On 4/9/24 11:04 AM, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 09.04.24 09:42, Cristian Ciocaltea wrote:
>>> On 4/9/24 7:44 AM, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 09.04.24 01:44, Cristian Ciocaltea wrote:
>>>>> On 4/7/24 10:47 AM, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>>> On 06.04.24 15:08, Bagas Sanjaya wrote:
>>>>>>> On Bugzilla, Daniel <dmanlfc@xxxxxxxxx> reported topology regression
>>>>>>> on Steam Deck OLED [1]. He wrote:
>>>>>
>>>>>>>> I'm adding this here, I hope it's the correct place.
>>>>>>>> Currently the Steam Deck OLED fails with Kernel 6.8.2 when trying to initialise the topology for the device.
>>>>>>>> I'm using the `sof-vangogh-nau8821-max.tplg` file from the Steam Deck OLED and associated firmware.
>>>>>>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=218677
>>>>>> A quick search made me find these posts/threads that foreshadow the problem:
>>>>>>
>>>>>> https://lore.kernel.org/lkml/20231219030728.2431640-1-cristian.ciocaltea@xxxxxxxxxxxxx/
>>>>>> https://lore.kernel.org/all/a3357e1f-f354-4d4b-9751-6b2182dceea6@xxxxxxx/
>>>>>>
>>>>>> From a quick look at the second discussion it seems a bit like we are
>>>>>> screwed, as iiutc topology files are out in the wild for one or the
>>>>>> other approach. So we might have to bite a bullet there and accept the
>>>>>> regression -- but I might easily be totally mistaken here. Would be good
>>>>>> if one of the experts (Venkata Prasad Potturu maybe?) could quickly
>>>>>> explain what's up here.
>>>>>
>>>>> The problem here is that Steam Deck OLED provides a topology file which
>>>>> uses an incorrect DAI link ID for BT codec.
>>>>>
>>>>> Patch [1] moves BT_BE_ID to position 2 in the enum, as expected by the
>>>>> topology, but this is not a change that can be accepted upstream as it
>>>>> would break other devices which rely on BT_BE_ID set to 3.
>>>>>
>>>>> The proper solution would be to update the topology file on Steam Deck,
>>>>> but this is probably not straightforward to accomplish, as it would
>>>>> break the compatibility with the currently released (downstream)
>>>>> kernels.
>>>>>
>>>>> Hopefully, this sheds some more light on the matter.
>>>>>
>>>>> [1]: https://lore.kernel.org/all/20231209205351.880797-11-cristian.ciocaltea@xxxxxxxxxxxxx/
>>>>
>>>> Many thx, yes, this sheds some light on the matter. But there is one
>>>> remaining question: can we make both camps happy somehow? E.g. something
>>>> along the lines of "first detect if the topology file has BT_BE_ID in
>>>> position 2 or 3 and then act accordingly"?
>>>
>>> Right, I have this on my TODO list but haven't managed to dig into it
>>> yet. However, that would be most likely just another hack to be carried
>>> on until the transition to a fixed topology is completed.
>>
>> Well, sure it's a hack, but the thing is, our number one rule is "no
>> regressions" and the reporter apparently faces one (see start of the
>> thread). So to fulfill this rule it would be ideal to have a fix
>> available soonish or revert the culprit and reapply it later together with
>> the fix.
>
> Hmm, unless I'm missing something, this shouldn't be considered a
> regression. As I explained previously, the OLED model was launched with
> a downstream implementation of the Vangogh SOF drivers on top of v6.1,
> as there was no upstream support back then.
>
> When AMD eventually completed the upstreaming process of their SOF
> drivers in v6.6, we ended up with this unfortunate ID assignment
> incompatibility. Hence I cannot see how the mainline kernel would have
> worked without applying patch [1] above, unless the reporter
> experimented with a different topology (which is not the case if I got
> this right).
>
>> Do we know which change that went into 6.8 caused this? Or is a revert
>> out-of-the question as it will likely break things for other users that
>> already upgraded to 6.8 and have a matching topology file? (/me fears
>> the answer to the latter question is "yes", but I have to ask :-/)
>
> We need to understand how the reporter got this working with mainline
> kernels without applying any out-of-tree patches.

Ahh, okay, thx, now I understand this better. You are most likely
correct. It also made me look at the initial report again where I
noticed "When *I manually patched support* for the 6.6 or 6.7 mainline
kernel it worked fine.", so yes, this likely is not a regression.

Thx for your help and sorry for the trouble I caused!

Ciao, Thorsten
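
For reference, the "detect position 2 or 3 and act accordingly" idea
discussed above could look roughly like the sketch below. It is only an
illustration, not kernel code: the struct, the function names and the
"bt" link name are hypothetical, and the only facts taken from the
thread are that upstream expects BT_BE_ID to be 3 while the Steam Deck
OLED topology uses 2.

```c
#include <stddef.h>
#include <string.h>

/* Values from the thread: upstream has BT_BE_ID == 3, the Deck topology uses 2. */
enum {
	UPSTREAM_BT_BE_ID  = 3,
	DECK_TPLG_BT_BE_ID = 2,
};

/* Hypothetical stand-in for one DAI link as parsed from a topology file. */
struct tplg_link {
	const char *name;
	int id;
};

enum bt_layout {
	BT_LAYOUT_UNKNOWN,
	BT_LAYOUT_UPSTREAM,
	BT_LAYOUT_DECK,
};

/* Return the ID the topology assigns to the named link, or -1 if absent. */
int tplg_bt_link_id(const struct tplg_link *links, size_t n, const char *bt_name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(links[i].name, bt_name) == 0)
			return links[i].id;
	return -1;
}

/* Decide which ID assignment the loaded topology was built against. */
enum bt_layout detect_bt_layout(const struct tplg_link *links, size_t n,
				const char *bt_name)
{
	switch (tplg_bt_link_id(links, n, bt_name)) {
	case UPSTREAM_BT_BE_ID:
		return BT_LAYOUT_UPSTREAM;
	case DECK_TPLG_BT_BE_ID:
		return BT_LAYOUT_DECK;
	default:
		return BT_LAYOUT_UNKNOWN;
	}
}
```

A driver could then pick its back-end ID table based on the returned
layout. Whether the real topology format exposes enough information to
do this reliably is an open question in the thread.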