Re: Fwd: Steam Deck OLED 6.8.2 nau8821-max fails
From: Cristian Ciocaltea
Date: Tue Apr 09 2024 - 07:24:52 EST
On 4/9/24 1:57 PM, Daniel Martin wrote:
> The manual patches were derived from the Steam Deck kernel by trial &
> error against the partial support in 6.6 at the time.
> 6.8 sources where this has been implemented recently were not yet
> available or in linux-next.
> Suffice to say the code matches up almost perfectly apart from the
> enum issues which are thus being discussed.
> I still go back to the point, apart from the Steam Deck, who else is
> using this named topology file?
> I don't think anyone is, therefore the enum numbering should match the
> current Steam Deck kernel implementation & topology file.
The entries of enum be_id cannot be changed in mainline without breaking
devices which are not part of the Steam Deck family.
I do not have additional details other than the information provided by
AMD in the context of the initial patch submission:
https://lore.kernel.org/all/a3357e1f-f354-4d4b-9751-6b2182dceea6@xxxxxxx/
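
To illustrate the conflict, here is a minimal standalone sketch (not the
actual kernel code). The only values taken from this thread are BT_BE_ID
being 3 in mainline and 2 in the Steam Deck OLED topology; the other entry
names, the ML_/DK_ prefixes and the bt_link_id() helper are assumptions made
up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Mainline ordering: per this thread, other devices rely on BT being 3.
 * The remaining entry names are assumed for illustration only.
 */
enum mainline_be_id {
	ML_HEADSET_BE_ID = 0,
	ML_AMP_BE_ID,
	ML_DMIC_BE_ID,
	ML_BT_BE_ID,		/* 3 */
};

/* Ordering the Steam Deck OLED topology expects: BT at position 2. */
enum deck_be_id {
	DK_HEADSET_BE_ID = 0,
	DK_AMP_BE_ID,
	DK_BT_BE_ID,		/* 2 */
	DK_DMIC_BE_ID,
};

/* Compile-time guards documenting the two numbering schemes. */
static_assert(ML_BT_BE_ID == 3, "mainline topologies expect BT at 3");
static_assert(DK_BT_BE_ID == 2, "Steam Deck OLED topology expects BT at 2");

/*
 * Hypothetical helper: choose the BT link ID from a per-device flag.
 * This is the kind of "detect and act accordingly" hack discussed below,
 * reduced to its simplest form.
 */
int bt_link_id(bool deck_topology)
{
	return deck_topology ? (int)DK_BT_BE_ID : (int)ML_BT_BE_ID;
}

/* True when the ID used by the kernel equals the ID baked into the topology. */
bool bt_link_matches(int kernel_bt_id, int topology_bt_id)
{
	return kernel_bt_id == topology_bt_id;
}
```

With the mainline numbering, the ID 2 that the Steam Deck topology uses for
BT is the value a different backend occupies, so the link cannot be matched.
Swapping the entries globally would move that mismatch onto every other
device.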
>
> On Tue, 9 Apr 2024 at 19:55, Cristian Ciocaltea
> <cristian.ciocaltea@xxxxxxxxxxxxx> wrote:
>>
>> On 4/9/24 12:19 PM, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 09.04.24 10:47, Cristian Ciocaltea wrote:
>>>> On 4/9/24 11:04 AM, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>> On 09.04.24 09:42, Cristian Ciocaltea wrote:
>>>>>> On 4/9/24 7:44 AM, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>>>> On 09.04.24 01:44, Cristian Ciocaltea wrote:
>>>>>>>> On 4/7/24 10:47 AM, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>>>>>>> On 06.04.24 15:08, Bagas Sanjaya wrote:
>>>>>>>>>> On Bugzilla, Daniel <dmanlfc@xxxxxxxxx> reported topology regression
>>>>>>>>>> on Steam Deck OLED [1]. He wrote:
>>>>>>>>
>>>>>>>>>>> I'm adding this here, I hope it's the correct place.
>>>>>>>>>>> Currently the Steam Deck OLED fails with Kernel 6.8.2 when trying to initialise the topology for the device.
>>>>>>>>>>> I'm using the `sof-vangogh-nau8821-max.tplg` file from the Steam Deck OLED and associated firmware.
>>>>>>>>>> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=218677
>>>>>>>>> A quick search made me find these posts/threads that foreshadow the problem:
>>>>>>>>>
>>>>>>>>> https://lore.kernel.org/lkml/20231219030728.2431640-1-cristian.ciocaltea@xxxxxxxxxxxxx/
>>>>>>>>> https://lore.kernel.org/all/a3357e1f-f354-4d4b-9751-6b2182dceea6@xxxxxxx/
>>>>>>>>>
>>>>>>>>> From a quick look at the second discussion it seems a bit like we are
>>>>>>>>> screwed, as iiutc topology files are out in the wild for one or the
>>>>>>>>> other approach. So we might have to bite a bullet there and accept the
>>>>>>>>> regression -- but I might easily be totally mistaken here. Would be good
>>>>>>>>> if one of the experts (Venkata Prasad Potturu maybe?) could quickly
>>>>>>>>> explain what's up here.
>>>>>>>>
>>>>>>>> The problem here is that Steam Deck OLED provides a topology file which
>>>>>>>> uses an incorrect DAI link ID for BT codec.
>>>>>>>>
>>>>>>>> Patch [1] moves BT_BE_ID to position 2 in the enum, as expected by the
>>>>>>>> topology, but this is not a change that can be accepted upstream as it
>>>>>>>> would break other devices which rely on BT_BE_ID set to 3.
>>>>>>>>
>>>>>>>> The proper solution would be to update the topology file on Steam Deck,
>>>>>>>> but this is probably not straightforward to accomplish, as it would
>>>>>>>> break the compatibility with the currently released (downstream)
>>>>>>>> kernels.
>>>>>>>>
>>>>>>>> Hopefully, this sheds some more light on the matter.
>>>>>>>>
>>>>>>>> [1]: https://lore.kernel.org/all/20231209205351.880797-11-cristian.ciocaltea@xxxxxxxxxxxxx/
>>>>>>>
>>>>>>> Many thx, yes, this sheds some light on the matter. But there is one
>>>>>>> remaining question: can we make both camps happy somehow? E.g. something
>>>>>>> along the lines of "first detect if the topology file has BT_BE_ID in
>>>>>>> position 2 or 3 and then act accordingly"?
>>>>>>
>>>>>> Right, I have this on my TODO list but haven't managed to dig into it
>>>>>> yet. However, that would be most likely just another hack to be carried
>>>>>> on until the transition to a fixed topology is completed.
>>>>>
>>>>> Well, sure it's a hack, but the thing is, our number one rule is "no
>>>>> regressions" and the reporter apparently faces one (see start of the
>>>>> thread). So to fulfill this rule it would be ideal to have a fix
>>>>> available soonish or revert the culprit and reapply it later together with
>>>>> the fix.
>>>>
>>>> Hmm, unless I'm missing something, this shouldn't be considered a
>>>> regression. As I explained previously, the OLED model was launched with
>>>> a downstream implementation of the Vangogh SOF drivers on top of v6.1,
>>>> as there was no upstream support back then.
>>>>
>>>> When AMD eventually completed the upstreaming process of their SOF
>>>> drivers in v6.6, we ended up with this unfortunate ID assignment
>>>> incompatibility. Hence I cannot see how the mainline kernel would have
>>>> worked without applying patch [1] above, unless the reporter
>>>> experimented with a different topology (which is not the case if I got
>>>> this right).
>>>>
>>>>> Do we know which change that went into 6.8 caused this? Or is a revert
>>>>> out-of-the question as it will likely break things for other users that
>>>>> already upgraded to 6.8 and have a matching topology file? (/me fears
>>>>> the answer to the latter question is "yes", but I have to ask :-/)
>>>>
>>>> We need to understand how the reporter got this working with mainline
>>>> kernels without applying any out-of-tree patches.
>>>
>>> Ahh, okay, thx, now I understand this better. You are most likely
>>> correct. It also made me look at the initial report again where I
>>> noticed "When *I manually patched support* for the 6.6 or 6.7 mainline
>>> kernel it worked fine.", so yes, this likely is not a regression.
>>
>> It would be interesting to find out what the *manually patched support*
>> involved. FWIW, to get audio working with v6.8, it's also necessary to
>> backport several patches from v6.9-rc1 - I would consider the following:
>>
>> Fixes: f0f1021fc9cb ("ASoC: amd: acp: Drop redundant initialization of machine driver data")
>> Fixes: 68ab29426d88 ("ASoC: amd: acp: Make use of existing *_CODEC_DAI macros")
>> Fixes: d0ada20279db ("ASoC: amd: acp: Add missing error handling in sof-mach")
>> Fixes: 222be59e5eed ("ASoC: SOF: amd: Fix memory leak in amd_sof_acp_probe()")
>> Fixes: a13f0c3c0e8f ("ASoC: SOF: amd: Optimize quirk for Valve Galileo")
>> Fixes: 369b997a1371 ("ASoC: SOF: core: Skip firmware test for custom loaders")
>> Fixes: d9cacc1a2af2 ("ASoC: SOF: amd: Compute file paths on firmware load")
>> Fixes: 33c3d8133307 ("ASoC: SOF: amd: Move signed_fw_image to struct acp_quirk_entry")
>> Fixes: 094d11768f74 ("ASoC: SOF: amd: Skip IRAM/DRAM size modification for Steam Deck OLED")
>>
>> I think most if not all of the mandatory fixes from the list above have been
>> already included in the latest v6.8 stable updates, but I haven't actually
>> tested.
>>
>>>
>>> Thx for your help and sorry for the trouble I caused!
>>
>> No problem at all!
>>
>> Regards,
>> Cristian
>
>
>