Re: [PATCH 23/25] ALSA/dummy: Replace tasklet with softirq hrtimer

From: Takashi Iwai
Date: Fri Sep 01 2017 - 07:58:49 EST


On Fri, 01 Sep 2017 12:25:37 +0200,
Takashi Sakamoto wrote:
>
> Hi,
>
> On Sep 1 2017 00:36, Takashi Iwai wrote:
> > I gave it at try, but it caused a kernel hang, unfortunately.
> >
> > The reason is that snd_pcm_period_elapased() may stop the stream
> > (e.g. when reaching at the end). With this patchset, it'll lead to
> > the call of hrtimer_cancel() from the hrtimer callback itself, thus it
> > stalls.
>
> I can reproduce this bug.
>
> > Below is the additional fix over your patch for working around it.
> > I believe it should cover most corner cases, and seems working fine
> > through quick tests, so far.
>
> This patch looks good to me, too. But I have an alternative.
>
> We can use 'hrtimer_callback_running()' to detect whether to be on hrtimer
> callback or not (please read '__run_hrtimer()' in 'kernel/time/hrtimer.c').

A good point, this is a better choice.

> Usage of this helper function on .stop callback to skip cancellation can
> avoid the stall. In this case, after stopping PCM substream, the hrtimer
> callback should return HRTIMER_NORESTART to avoid restarting, as well as
> your patch. Please test a patch in this message.
>
> > ---
> > diff --git a/sound/drivers/dummy.c b/sound/drivers/dummy.c
> > index 273d60c42125..b5dd64e3dab1 100644
> > --- a/sound/drivers/dummy.c
> > +++ b/sound/drivers/dummy.c
> > @@ -375,6 +375,7 @@ struct dummy_hrtimer_pcm {
> > ktime_t base_time;
> > ktime_t period_time;
> > atomic_t running;
> > + atomic_t callback_running;
> > struct hrtimer timer;
> > struct snd_pcm_substream *substream;
> > };
> > @@ -387,8 +388,15 @@ static enum hrtimer_restart dummy_hrtimer_callback(struct hrtimer *timer)
> > if (!atomic_read(&dpcm->running))
> > return HRTIMER_NORESTART;
> >
> > + atomic_inc(&dpcm->callback_running);
> > snd_pcm_period_elapsed(dpcm->substream);
> > + atomic_dec(&dpcm->callback_running);
> > + /* may be flipped during snd_pcm_period_elapsed() */
> > + if (!atomic_read(&dpcm->running))
> > + return HRTIMER_NORESTART;
> > +
> > hrtimer_forward_now(timer, dpcm->period_time);
> > + atomic_dec(&dpcm->callback_running);
> > return HRTIMER_RESTART;
> > }
> >
> > @@ -407,7 +415,9 @@ static int dummy_hrtimer_stop(struct snd_pcm_substream *substream)
> > struct dummy_hrtimer_pcm *dpcm = substream->runtime->private_data;
> >
> > atomic_set(&dpcm->running, 0);
> > - hrtimer_cancel(&dpcm->timer);
> > + /* issue hrtimer_cancel() only when called outside the callback */
> > + if (!atomic_read(&dpcm->callback_running))
> > + hrtimer_cancel(&dpcm->timer);
> > return 0;
> > }
> >
> > @@ -462,6 +472,7 @@ static int dummy_hrtimer_create(struct snd_pcm_substream *substream)
> > dpcm->timer.function = dummy_hrtimer_callback;
> > dpcm->substream = substream;
> > atomic_set(&dpcm->running, 0);
> > + atomic_set(&dpcm->callback_running, 0);
> > return 0;
> > }
>
> >From 07d61ba2a1c0e06e914443225e194d99f2d8c58d Mon Sep 17 00:00:00 2001
> From: Takashi Sakamoto <o-takashi@xxxxxxxxxxxxx>
> Date: Fri, 1 Sep 2017 19:10:18 +0900
> Subject: [PATCH] ALSA: dummy: avoid stall due to a call of hrtimer_cancel() on
> a callback of hrtimer
>
> A call of 'htrimer_cancel()' on a callback of hrtimer brings endless loop
> because 'struct hrtimer_clock_base.running' is not NULL on the callback.
> In hrtimer subsystem, this member is used to indicate the instance of
> hrtimer gets callbacks and there's a helper function,
> 'hrtimer_callback_running()' to check it.
>
> ALSA dummy driver uses hrtimer to emulate hardware interrupt per period
> of PCM buffer. When XRUN occurs on PCM substream, in a call of
> 'snd_pcm_period_elapsed()', 'struct snd_pcm_ops.stop()' is called to
> stop the substream. In current implementation, 'hrtimer_cancel()' is
> used to wait for cancellation of hrtimer. However, as described, this
> brings endless loop.

It's not only about XRUN. When the stream finishes the draining, it
stops the stream gracefully -- that is the very normal operation.

> For this problem, this commit uses 'hrtimer_callback_running()' to
> detect whether to be on a callback of hrtimer or not, then skip
> cancellation of hrtimer in hrtimer callbacks. Furthermore, at a case of
> XRUN, hrtimer callback returns HRTIMER_NORESTART after a call of
> 'snd_pcm_period_elapsed()' to discontinue hrtimr because cancellation is
> skipped.
>
> Signed-off-by: Takashi Sakamoto <o-takashi@xxxxxxxxxxxxx>

It's better to fold the fix into the original patch instead of
introducing a bug and fixing it.


Takashi