Re: [PATCH] net: ftgmac100: Fix missing TX-poll issue
From: Benjamin Herrenschmidt
Date: Tue Oct 20 2020 - 18:06:29 EST
On Tue, 2020-10-20 at 13:15 +0000, David Laight wrote:
> That rather depends where the data is 'stuck'.
>
> An old sparc cpu would flush the cpu store buffer before the read.
> But a modern x86 cpu will satisfy the read from the store buffer
> for cached data.
>
> If the write is 'posted' on a PCI(e) bus then the read can't overtake it.
> But that is a memory access so shouldn't be to a PCI(e) address.
>
> Shouldn't dma_wb() actually force your 'cpu to dram' queue be flushed?
> In which case you need one after writing the ring descriptor and
> before the poke of the mac engine.
>
> The barrier before the descriptor write only needs to guarantee
> ordering of the writes - it can probably be a lighter barrier?
>
> It might be that your dma_wmb() needs to do a write+read of
> an uncached DRAM location in order to empty the cpu to dram queue.
This is a specific bug with how a specific IP block is hooked up in
those SOCs, I wouldn't bloat the global dma_wmb for that. The read back
in the driver with appropriate comment should be enough.
Cheers,
Ben.