Re: [PATCH] dmaengine: pl330: Fix race in residue reporting

From: Sjoerd Simons
Date: Mon Nov 09 2015 - 08:28:20 EST


On Sat, 2015-11-07 at 21:40 +0900, Krzysztof Kozlowski wrote:
> W dniu 06.11.2015 o 20:11, Sjoerd Simons pisze:
> > When a transfer completes there is a small window between the
> > descriptor
> > being unset as the current active one in the thread and it being
> > marked
> > as done. This causes the residue to be incorrectly set when
> > pl330_tx_status is run in that window. Practically this caused
> > issue for me with audio playback as the residue goes up during a
> > transfer (as the in-progress transfer is no longer accounted for),
> > which makes the higher levels think the audio ringbuffer wrapped
> > around
> > and thus did a sudden big jump forward.
> >
> > To resolve this, add a field protected by the dma engine lock to
> > indicate the transfer is done but the status hasn't been updated
> > yet.
> >
> > Also fix the locking in pl330_tx_status, as the function inspects
> > the
> > threads req_running field and queries the dma engine for the
> > current
> > state of the running transfer the dma engine lock needs to be held
> > to
> > ensure the active descriptor doesn't change underneath it.
> >
> > Signed-off-by: Sjoerd Simons <sjoerd.simons@xxxxxxxxxxxxxxx>
> >
> > ---
> >
> > Âdrivers/dma/pl330.c | 10 +++++++++-
> > Â1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
> > index 17ee758..6c8243b 100644
> > --- a/drivers/dma/pl330.c
> > +++ b/drivers/dma/pl330.c
> > @@ -503,6 +503,8 @@ struct dma_pl330_desc {
> > Â struct pl330_reqcfg rqcfg;
> > Â
> > Â enum desc_status status;
> > + /* Transfer completed, but not yet moved to DONE state */
> > + bool xferred;
> > Â
> > Â int bytes_requested;
> > Â bool last;
> > @@ -1463,6 +1465,9 @@ static void dma_pl330_rqcb(struct
> > dma_pl330_desc *desc, enum pl330_op_err err)
> > Â spin_lock_irqsave(&pch->lock, flags);
> > Â
> > Â desc->status = DONE;
> > + spin_lock(&pch->thread->dmac->lock);
> > + desc->xferred = false;
> > + spin_unlock(&pch->thread->dmac->lock);
> > Â
> > Â spin_unlock_irqrestore(&pch->lock, flags);
> > Â
> > @@ -1595,6 +1600,7 @@ static int pl330_update(struct pl330_dmac
> > *pl330)
> > Â
> > Â /* Detach the req */
> > Â descdone = thrd->req[active].desc;
> > + descdone->xferred = true;
> > Â thrd->req[active].desc = NULL;
>
> Looking at the code indeed the small window could happen. How can I
> reproduce it? Can you describe your system?

I'm using a Rock 2 square (RK3288 based) board, running Debian Jessie
and simply playing back audio using vlc & pulseaudio.

For some reason when using vlc, pulseaudio is polling the hw pointer
position very very often and seems to hit this tiny window a few times
a minute depending a bit on system load. This causes some audiable
issues and eventually underflowing (as vlc fills the pulseaudio buffer
at a constant rate, but due to this, pulse pushes the data faster then
the actual playback rate every so often).

> As for the change itself, how about adding a new value to
> desc_status?
> Now you are actually introducing a semi-status field.

The reason i picked this way this was was due to the current locking
scheme. The locking order is channel, then controller (when locking
both). While processing the events (and update the active descriptor),
the controller is locked so the status can't be directly updated (as
that's protected by the channel lock). And we can't grab the channel
lock without dopping the controller one first :).

Keeping one status field might well be possible, but would require at
least reworking the locking here.



> Best regards,
> Krzysztof
>
> > Â
> > Â thrd->req_running = -1;
> > @@ -2250,13 +2256,14 @@ pl330_tx_status(struct dma_chan *chan,
> > dma_cookie_t cookie,
> > Â goto out;
> > Â
> > Â spin_lock_irqsave(&pch->lock, flags);
> > + spin_lock(&pch->thread->dmac->lock);
> > Â
> > Â if (pch->thread->req_running != -1)
> > Â running = pch->thread->req[pch->thread-
> > >req_running].desc;
> > Â
> > Â /* Check in pending list */
> > Â list_for_each_entry(desc, &pch->work_list, node) {
> > - if (desc->status == DONE)
> > + if (desc->xferred || desc->status == DONE)
> > Â transferred = desc->bytes_requested;
> > Â else if (running && desc == running)
> > Â transferred =
> > @@ -2281,6 +2288,7 @@ pl330_tx_status(struct dma_chan *chan,
> > dma_cookie_t cookie,
> > Â if (desc->last)
> > Â residual = 0;
> > Â }
> > + spin_unlock(&pch->thread->dmac->lock);
> > Â spin_unlock_irqrestore(&pch->lock, flags);
> > Â
> > Âout:
> >
>

--
Sjoerd Simons
Collabora Ltd.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/