Re: [RFC PATCH v3 05/13] ASoC: soc-pcm: align BE 'atomicity' with that of the FE
From: Pierre-Louis Bossart
Date: Fri Oct 15 2021 - 07:22:58 EST
>> In normal PCM, atomicity seems to apply only for trigger(). Other
>> callbacks like prepare, hw_params are executed in non-atomic
>> context. So when 'nonatomic' flag is false, still it is possible to
>> sleep in a prepare or hw_param callback and this is true for FE as
>> well. So I am not sure if atomicity is applicable as a whole even for
>> FE.
The typical path is snd_pcm_elapsed() on the FE, which will trigger the
BE. When we add the BE lock in patch7, things break: what matters is the
FE context, the locks used for the BE have to be aligned with the FE
atomicity.
My test results show the problem:
https://github.com/thesofproject/linux/pull/3209#issuecomment-941229502
and this patch fixes the issue.
I am all ears if someone has a better solution, but the problem is real.
I chose to add this patch first, before the BE lock is added in
dpcm_be_dai_trigger(), and if this causes problems already there are
even more issues in DPCM :-(
If this patch causes issues outside of the trigger phase, then maybe we
could just change the BE nonatomic flag temporarily and restore it after
taking the lock, but IMHO something else is broken in other drivers.
>> At this point it does not cause serious problems, but with subsequent
>> patches (especially when patch 7/13 is picked) I see failures. Please
>> refer to patch 7/13 thread for more details.
>>
>>
>> I am wondering if it is possible to only use locks internally for DPCM
>> state management and decouple BE callbacks from this, like normal PCMs
>> do?
>
> Actually the patch looks like an overkill by adding the FE stream lock
> at every loop, and this caused the problem, AFAIU.
>
> Basically we need to protect the link addition / deletion while the
> list traversal (there is a need for protection of BE vs BE access
> race, but that's a different code path). For the normal cases, it
> seems already protected by card->pcm_mutex, but the problem is the FE
> trigger case. It was attempted by dpcm_lock, but that failed because
> it couldn't be applied properly there.
>
> That said, what we'd need is only:
> - Drop dpcm_lock codes once
I am not able to follow this sentence, what did you mean here?
> - Put FE stream lock around dpcm_be_connect() and dpcm_be_disconnect()
>
> That should suffice for the race at trigger. The FE stream lock is
> already taken at trigger callback, and the rest list addition /
> deletion are called from different code paths without stream locks, so
> the explicit FE stream lock is needed there.
I am not able to follow what you meant after "the rest". This sentence
mentions the FE stream lock in two places, but the second is not clear
to me.
> In addition, a lock around dpcm_show_state() might be needed to be
> replaced with card->pcm_mutex, and we may need to revisit whether all
> other paths take card->pcm_mutex.
What happens if we show the state while a trigger happens? That's my
main concern with using two separate locks (pcm_mutex and FE stream
lock) to protect the same list, there are still windows of time where
the list is not protected.
same with the use of for_each_dpcm_be() in soc-compress, SH and FSL
drivers, there's no other mutex there.
My approach might have been overkill in some places, but it's comprehensive.
> Also, BE-vs-BE race can be protected by taking a BE lock inside
> dpcm_be_dai_trigger().
that's what I did, no?