Re: [PATCH] ASoC: dapm: Check for NULL widget in dapm_update_dai_unlocked

From: Charles Keepax
Date: Wed Feb 06 2019 - 06:46:32 EST


On Wed, Feb 06, 2019 at 12:14:56PM +0100, Krzysztof Kozlowski wrote:
> On Wed, 6 Feb 2019 at 12:00, Charles Keepax
> <ckeepax@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, Feb 06, 2019 at 11:22:33AM +0100, Krzysztof Kozlowski wrote:
> > > On Wed, 6 Feb 2019 at 11:05, Charles Keepax
> > > <ckeepax@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > DAIs linked to the dummy will not have an associated playback/capture
> > > > widget, so we need to skip the update in that case.
> > > >
> > > > Fixes: 078a85f2806f ("ASoC: dapm: Only power up active channels from a DAI")
> > > > Signed-off-by: Charles Keepax <ckeepax@xxxxxxxxxxxxxxxxxxxxx>
> > > > ---
> > > >
> > > > Ok so that all makes sense, this patch is probably the best fix?
> > > >
> > > > Thanks,
> > > > Charles
> > >
> > > For this particular issue (NULL-pointer):
> > > Reported-by: Krzysztof Kozlowski <krzk@xxxxxxxxxx>
> > > Tested-by: Krzysztof Kozlowski <krzk@xxxxxxxxxx>
> > >
> > > However now I see bug sleeping in atomic context:
> > >
> > > [ 64.000828] BUG: sleeping function called from invalid context at
> > > ../kernel/locking/mutex.c:908
> >
> > Does this probably definitely get fixed by reverting my patch?
> > It's just a bit odd as this seems to be complaining about a clock
> > operation in i2s_trigger and I don't think my patch should have
> > any affect on the trigger callback. It should get run from either
> > the dai_link DAPM event or hw_params, neither of which should
> > happen in an atomic context.
>
> Before this fixup, probably NULL pointer happened before any of this.
> I tried it now few times and the possible deadlock and sleeping in
> invalid context did not appear. It might be random/racy or totally
> unrelated to your change.
>

Looking through I think this is an unrelated issue. Assuming the
driver in question is (sound/soc/samsung/i2s.c). Inside
i2s_trigger, there is a call to config_setup(i2s), which in turn
will call clk_get_rate if i2s->quirks contains the flag
QUIRK_NO_MUXPSR.

The trigger callback can be made from an atomic context and
clk_get_rate will take the prepare lock of the clock. The clock
prepare lock is always a mutex which shouldn't be locked from an
atomic context. So it seems like this will fail whenever that
QUIRK_NO_MUXPSR flag is set, no idea what causes that to be set.

It looks like this bug was introduced by this change:

647d04f8e07a ("ASoC: samsung: i2s: Ensure the RCLK rate is properly determined")

Thanks,
Charles