Re: LKMM: Read dependencies of writes ordered by dma_wmb()?

From: Paul E. McKenney
Date: Tue Aug 17 2021 - 09:53:10 EST


On Tue, Aug 17, 2021 at 01:28:16PM +0100, Will Deacon wrote:
> Just on this bit...
>
> On Mon, Aug 16, 2021 at 01:50:57PM -0700, Paul E. McKenney wrote:
> > 5. The dma_mb(), dma_rmb(), and dma_wmb() appear to be specific
> > to ARMv8.
>
> These are useful on other architectures too! IIRC, they were added by x86 in
> the first place. They're designed to be used with dma_alloc_coherent()
> allocations where you're sharing something like a ring buffer with a device
> and they guarantee accesses won't be reordered before they become visible
> to the device. They _also_ provide the same ordering to other CPUs.
>
> I gave a talk at LPC about some of this, which might help (or might make
> things worse...):
>
> https://www.youtube.com/watch?v=i6DayghhA8Q

The slides are here, correct? Nice summary and examples!

https://elinux.org/images/a/a8/Uh-oh-Its-IO-Ordering-Will-Deacon-Arm.pdf

And this is all I see for dma_mb():

arch/arm64/include/asm/barrier.h:#define dma_mb() dmb(osh)
arch/arm64/include/asm/io.h:#define __iomb() dma_mb()

And then for __iomb():

arch/arm64/include/asm/io.h:#define __iomb() dma_mb()
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c: __iomb();

But yes, dma_rmb() and dma_wmb() do look to have a few hundred uses
between them, and not just within ARMv8. I gave up too soon, so
thank you!
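
A minimal userspace sketch of the descriptor-ring pattern Will describes
above, for illustration only. Everything prefixed sketch_ is hypothetical,
as is the descriptor layout. The C11 fences are stand-ins that model only
the CPU-to-CPU half of the ordering; the real dma_wmb() and dma_rmb() are
arch-specific and also order accesses as seen by the device, which plain
C11 cannot express.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Assumption: userspace stand-ins for the kernel's dma_wmb()/dma_rmb().
 * They only model ordering between CPUs, not visibility to a device.
 */
#define sketch_dma_wmb() atomic_thread_fence(memory_order_release)
#define sketch_dma_rmb() atomic_thread_fence(memory_order_acquire)

#define SKETCH_DESC_READY 1u	/* hypothetical "descriptor is valid" flag */

/* Hypothetical descriptor, as might live in a dma_alloc_coherent() ring. */
struct sketch_desc {
	uint32_t addr;
	uint32_t len;
	_Atomic uint32_t flags;
};

/* Producer: fill in the payload fields, then hand the descriptor over. */
static void sketch_publish(struct sketch_desc *d, uint32_t addr, uint32_t len)
{
	d->addr = addr;
	d->len = len;
	sketch_dma_wmb();	/* payload writes ordered before the flag write */
	atomic_store_explicit(&d->flags, SKETCH_DESC_READY,
			      memory_order_relaxed);
}

/* Consumer: check the flag first, and only then read the payload. */
static int sketch_consume(struct sketch_desc *d, uint32_t *addr, uint32_t *len)
{
	uint32_t f = atomic_load_explicit(&d->flags, memory_order_relaxed);

	if (!(f & SKETCH_DESC_READY))
		return 0;	/* not handed over yet */
	sketch_dma_rmb();	/* flag read ordered before the payload reads */
	*addr = d->addr;
	*len = d->len;
	return 1;
}
```

The point of the pairing is that a consumer that observes the flag set is
also guaranteed to observe the addr and len values written before it.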

> Ignore the bits about mmiowb() as we got rid of that.

Should the leftovers in current mainline be replaced by wmb()? Or are
patches to that effect on their way in somewhere?

$ git grep 'mmiowb()'
arch/ia64/include/asm/mmiowb.h:#define mmiowb() ia64_mfa()
arch/ia64/include/asm/spinlock.h: mmiowb();
arch/mips/include/asm/mmiowb.h:#define mmiowb() iobarrier_w()
arch/mips/include/asm/spinlock.h: mmiowb();
arch/mips/kernel/gpio_txx9.c: mmiowb();
arch/mips/kernel/gpio_txx9.c: mmiowb();
arch/mips/kernel/gpio_txx9.c: mmiowb();
arch/mips/kernel/irq_txx9.c: mmiowb();
arch/mips/loongson2ef/common/bonito-irq.c: mmiowb();
arch/mips/loongson2ef/common/bonito-irq.c: mmiowb();
arch/mips/loongson2ef/common/mem.c: mmiowb();
arch/mips/loongson2ef/common/pm.c: mmiowb();
arch/mips/loongson2ef/lemote-2f/reset.c: mmiowb();
arch/mips/loongson2ef/lemote-2f/reset.c: mmiowb();
arch/mips/loongson2ef/lemote-2f/reset.c: mmiowb();
arch/mips/loongson2ef/lemote-2f/reset.c: mmiowb();
arch/mips/loongson2ef/lemote-2f/reset.c: mmiowb();
arch/mips/pci/ops-bonito64.c: mmiowb();
arch/mips/pci/ops-loongson2.c: mmiowb();
arch/mips/txx9/generic/irq_tx4939.c: mmiowb();
arch/mips/txx9/generic/setup.c: mmiowb();
arch/mips/txx9/rbtx4927/irq.c: mmiowb();
arch/mips/txx9/rbtx4938/irq.c: mmiowb();
arch/mips/txx9/rbtx4938/irq.c: mmiowb();
arch/mips/txx9/rbtx4938/setup.c: mmiowb();
arch/mips/txx9/rbtx4939/irq.c: mmiowb();
arch/powerpc/include/asm/mmiowb.h:#define mmiowb() mb()
arch/riscv/include/asm/mmiowb.h:#define mmiowb() __asm__ __volatile__ ("fence o,w" : : : "memory");
arch/s390/include/asm/io.h:#define mmiowb() zpci_barrier()
arch/sh/include/asm/mmiowb.h:#define mmiowb() wmb()
arch/sh/include/asm/spinlock-llsc.h: mmiowb();
include/asm-generic/mmiowb.h: * Generic implementation of mmiowb() tracking for spinlocks.
include/asm-generic/mmiowb.h: * 1. Implement mmiowb() (and arch_mmiowb_state() if you're fancy)
include/asm-generic/mmiowb.h: mmiowb();

Thanx, Paul