Re: [PATCH 2/6] Revert "arm64: dts: renesas: r8a7796: Enable DMA for SCIF2"

From: Geert Uytterhoeven
Date: Mon May 20 2019 - 03:40:18 EST


Hi Shimoda-san,

Thanks for your analysis!

On Mon, May 20, 2019 at 4:18 AM Yoshihiro Shimoda
<yoshihiro.shimoda.uh@xxxxxxxxxxx> wrote:
> > From: Eugeniu Rosca, Sent: Tuesday, May 7, 2019 4:43 AM
> <snip>
> > > > [0] v5.0-rc6 commit 97f26702bc95b5 ("arm64: dts: renesas: r8a7796: Enable DMA for SCIF2")
> > > > [1] v4.14.106 commit 703db5d1b1759f ("arm64: dts: renesas: r8a7796: Enable DMA for SCIF2")
> > > > [2] scif (DEBUG) and rcar-dmac logs:
> > > > https://gist.github.com/erosca/132cce76a619724a9e4fa61d1db88c66
> <snip>
> > Enabling DEBUG in drivers/dma/sh/rcar-dmac.c, I can notice that one of
> > the symptoms is a NULL dst_addr revealed by:
> >
> > rcar-dmac e7300000.dma-controller: chan0: queue chunk (____ptrval____): 0@0xffff800639eb8090 -> 0x0000000000000000
> >
> > In working scenarios, dst_addr is never zero. Does it give any hints?
>
> Thank you for the report! It's very helpful to me.
> I think we should fix the sh-sci driver at least.
>
> According to the [2] log above,
>
> [ 4.379716] sh-sci e6e88000.serial: sci_dma_tx_work_fn: ffff800639b55000: 0...0, cookie 126
>
> This "0...0" means that s->tx_dma_len in sci_dma_tx_work_fn() will be zero. And,

How can this happen? schedule_work(&s->work_tx) is called only if
!uart_circ_empty(), and while holding the port lock. So the circular
buffer must have been emptied between the call to schedule_work() and
the work function sci_dma_tx_work_fn() actually running.

I think this can happen if uart_flush_buffer() is called at the right
moment?
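
To illustrate the suspected sequence, here is a minimal userspace sketch
(not driver code). All struct and function names are hypothetical. The
only parts taken from the kernel are the CIRC_CNT() arithmetic from
include/linux/circ_buf.h and the assumption that a flush resets both
head and tail to zero.

```c
#include <assert.h>

#define MODEL_XMIT_SIZE 4096u	/* assumed power-of-two buffer size */

/* Hypothetical stand-in for the serial core's circular TX buffer. */
struct model_circ {
	unsigned int head;
	unsigned int tail;
};

/* Same arithmetic as CIRC_CNT() in include/linux/circ_buf.h. */
unsigned int model_circ_cnt(const struct model_circ *c)
{
	/* The mask trick only works for power-of-two sizes. */
	assert((MODEL_XMIT_SIZE & (MODEL_XMIT_SIZE - 1)) == 0);
	return (c->head - c->tail) & (MODEL_XMIT_SIZE - 1);
}

/* Mirrors the idea of uart_circ_empty(): no pending bytes. */
int model_circ_empty(const struct model_circ *c)
{
	return c->head == c->tail;
}

/* Models a flush: assumed to reset both indices to zero. */
void model_circ_flush(struct model_circ *c)
{
	c->head = 0;
	c->tail = 0;
}

/*
 * Models the window between scheduling the work and running it.
 * Returns the length the work function would compute, or -1 if the
 * work would not have been scheduled in the first place.
 */
int model_len_seen_by_work(struct model_circ *c, int flush_in_between)
{
	if (model_circ_empty(c))
		return -1;		/* schedule_work() is never reached */
	if (flush_in_between)
		model_circ_flush(c);	/* races with the pending work */
	return (int)model_circ_cnt(c);
}
```

With a non-empty buffer and a flush in between, the modelled work
function sees head == tail == 0 and a length of zero, which would be
consistent with the "0...0" in the log.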

> > rcar-dmac e7300000.dma-controller: chan0: queue chunk (____ptrval____): 0@0xffff800639eb8090 -> 0x0000000000000000
>
> This means chunk->dst_addr is not set to the "dst_addr" for SCIF, because the len in rcar_dmac_chan_prep_sg() is zero.
> So, I'm thinking:
> - we have to fix the sh-sci driver to avoid transfers with "tx_dma_len = 0".

That sounds like just a simple check for !s->tx_dma_len in
sci_dma_tx_work_fn(), to return early, _and_ reset s->cookie_tx to
-EINVAL.

However, uart_flush_buffer() may still be called in between the check
and the calls to dmaengine_prep_slave_single() /
dma_sync_single_for_device(), clearing s->tx_dma_len again.
Unless something has changed recently, these two calls cannot be moved
inside the spinlock-protected section?
Using a cached value of s->tx_dma_len for the dmaengine calls might
work, though.
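
A rough userspace sketch of the "early return plus cached length" idea,
assuming a much reduced port structure. Everything prefixed with model_
is hypothetical and only stands in for the real driver state and the
dmaengine prep call; it is not the sh-sci code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, reduced stand-in for the driver's per-port state. */
struct model_port {
	unsigned int tx_dma_len;	/* can be zeroed by a concurrent flush */
	int cookie_tx;
};

/* Records what the fake "prep" call was asked to transfer. */
unsigned int model_last_prep_len;

/* Stand-in for the dmaengine prep call; returns a fake cookie. */
int model_prep(unsigned int len)
{
	model_last_prep_len = len;
	return 1;
}

/* Models a flush hitting the port while the work function runs. */
void model_port_flush(struct model_port *s)
{
	s->tx_dma_len = 0;
}

int model_tx_work(struct model_port *s, int flush_races)
{
	/* In the driver, this snapshot would be taken under the port lock. */
	unsigned int len = s->tx_dma_len;

	if (!len) {
		s->cookie_tx = -EINVAL;	/* nothing in flight */
		return -EINVAL;
	}

	if (flush_races)
		model_port_flush(s);	/* happens after the lock is dropped */

	assert(len != 0);		/* the snapshot is unaffected */
	s->cookie_tx = model_prep(len);	/* uses len, not s->tx_dma_len */
	return 0;
}
```

The point is only that the prep call never sees a zero length, even if
s->tx_dma_len is cleared after the check. Whether it is acceptable to
still send bytes that were flushed in the meantime is a separate
question this sketch does not answer.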

> and
>
> - we also have to fix the rcar-dmac driver to avoid this issue, because the DMA Engine API
> guide doesn't forbid len = 0.

I guess returning an error for len = 0 makes the most sense?
Otherwise we have to fix it deeper in the driver, where handling becomes
more complex.
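
As a sketch of what rejecting len = 0 up front could look like, here is
a small userspace model. dmaengine prep callbacks signal failure by
returning NULL instead of a descriptor, so "returning an error" is
modelled as NULL. The struct and function names are hypothetical and do
not match rcar-dmac internals.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a transfer chunk. */
struct model_chunk {
	unsigned long src_addr;
	unsigned long dst_addr;
	size_t size;
};

/*
 * Models a prep function that refuses zero-length requests before
 * touching the chunk, so a half-initialised chunk (e.g. with a zero
 * dst_addr) is never queued.
 * Returns the filled-in chunk, or NULL if the request is rejected.
 */
struct model_chunk *model_prep_chunk(struct model_chunk *chunk,
				     unsigned long src, unsigned long dst,
				     size_t len)
{
	assert(chunk != NULL);

	if (len == 0)
		return NULL;	/* reject instead of queueing an empty chunk */

	chunk->src_addr = src;
	chunk->dst_addr = dst;
	chunk->size = len;
	return chunk;
}
```

A caller would then treat NULL like any other prep failure, for example
by falling back to PIO.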

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds