RE: [PATCH 2/6] Revert "arm64: dts: renesas: r8a7796: Enable DMA for SCIF2"
From: Yoshihiro Shimoda
Date: Mon May 20 2019 - 05:55:15 EST
Hi Geert-san,
Thank you for your reply!
> From: Geert Uytterhoeven, Sent: Monday, May 20, 2019 4:38 PM
>
> Hi Shimoda-san,
>
> Thanks for your analysis!
>
> On Mon, May 20, 2019 at 4:18 AM Yoshihiro Shimoda
> <yoshihiro.shimoda.uh@xxxxxxxxxxx> wrote:
> > > From: Eugeniu Rosca, Sent: Tuesday, May 7, 2019 4:43 AM
> > <snip>
> > > > > [0] v5.0-rc6 commit 97f26702bc95b5 ("arm64: dts: renesas: r8a7796: Enable DMA for SCIF2")
> > > > > [1] v4.14.106 commit 703db5d1b1759f ("arm64: dts: renesas: r8a7796: Enable DMA for SCIF2")
> > > > > [2] scif (DEBUG) and rcar-dmac logs:
> > > > > https://gist.github.com/erosca/132cce76a619724a9e4fa61d1db88c66
> > <snip>
> > > Enabling DEBUG in drivers/dma/sh/rcar-dmac.c, I can notice that one of
> > > the symptoms is a NULL dst_addr revealed by:
> > >
> > > rcar-dmac e7300000.dma-controller: chan0: queue chunk (____ptrval____): 0@0xffff800639eb8090 -> 0x0000000000000000
> > >
> > > In working scenarios, dst_addr is never zero. Does it give any hints?
> >
> > Thank you for the report! It's very helpful to me.
> > I think we should fix the sh-sci driver at least.
> >
> > According to the [2] log above,
> >
> > [ 4.379716] sh-sci e6e88000.serial: sci_dma_tx_work_fn: ffff800639b55000: 0...0, cookie 126
> >
> > This "0...0" means the s->tx_dma_len on the sci_dma_tx_work_fn will be zero. And,
>
> How can this happen? schedule_work(&s->work_tx) is called only if
> !uart_circ_empty(), and while holding the port lock? So the circular
> buffer must be made empty in between the call to schedule_work() and the
> work function sci_dma_tx_work_fn() being called.
>
> I think this can happen if uart_flush_buffer() is called at the right
> moment?
I think so. According to the log [2], the xmit->head and tail is set to zero.
278 [ 4.331234] sh-sci e6e88000.serial: sci_dma_tx_work_fn: ffff800639b55000: 9...52, cookie 124
279 [ 4.334885] sh-sci e6e88000.serial: sci_dma_tx_complete(0)
280 [ 4.339992] sh-sci e6e88000.serial: sci_dma_tx_work_fn: ffff800639b55000: 52...100, cookie 125
281 [ 4.343340] sh-sci e6e88000.serial: sci_dma_tx_complete(0)
282 [ 4.379716] sh-sci e6e88000.serial: sci_dma_tx_work_fn: ffff800639b55000: 0...0, cookie 126
> > > rcar-dmac e7300000.dma-controller: chan0: queue chunk (____ptrval____): 0@0xffff800639eb8090 -> 0x0000000000000000
> >
> > This means the chunk->dst_addr is not set to the "dst_addr" for SCIF because the len on rcar_dmac_chan_prep_sg is zero.
> > So, I'm thinking:
> > - we have to fix the sh_sci driver to avoid "tx_dma_len = 0" transferring.
>
> That sounds like just a simple check for !s->tx_dma_len in
> sci_dma_tx_work_fn(), to return early, _and_ reset s->cookie_tx to
> -EINVAL.
>
> However, uart_flush_buffer() may still be called in between the check
> and the calls to dmaengine_prep_slave_single() /
> dma_sync_single_for_device(), clearing s->tx_dma_len again.
> Unless something has changed recently, these two calls cannot be moved
> inside the spinlock-protected section?
I also think these two calls (and dmaengine_submit() and dma_async_issue_pending())
should be moved inside the spinlock-protected section like sci_dma_rx_complete().
Also, sci_flush_buffer() should have the spinlock-protected section and
check the xmit and dma state somehow.
> Using a cached value of s->tx_dma_len for the dmaengine calls might
> work, though.
>
> > and
> >
> > - also we have to fix the rcar-dmac driver to avoid this issue because the DMA Engine API
> > guide doesn't prevent the len = 0.
>
> I guess returning an error makes most sense?
> Else we have to fix it deeper into the driver, where handling becomes
> more complex.
I see. I think so. (We should avoid more complex.)
Best regards,
Yoshihiro Shimoda