Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

From: Logan Gunthorpe
Date: Mon Mar 05 2018 - 20:14:24 EST




On 05/03/18 05:49 PM, Oliver wrote:
It's in arch/powerpc/kernel/io.c as _memcpy_toio() and it has two full barriers!

Awesome!

Our io.h indicates that our iomem accessors are designed to provide x86ish
strong ordering of accesses to MMIO space. The git log indicates
arch/powerpc/kernel/io.c has barely been touched in the last decade so
odds are most of that code was written in the elder days when people
were less aware of ordering issues. It might just be overly conservative
by today's standards, but maybe not (see below).

Yes, that seems overly conservative.

(I'm not going to suggest ditching the lwsync trick. mpe is not going
to take that patch
without a really good reason)

Well, that's pretty gross. Is this not exactly the situation mmiowb() is meant to solve? See [1].

Though, you're right in principle. Even if power was similar to other systems in this way, it's still a risk that if these pages get passed somewhere in the kernel that uses a spin lock like that without an mmiowb() call, then it's going to have a bug. For now, the risk is pretty low as we know exactly where all the p2pmem pages will be used but if it gets into other places, all bets are off. I did do some work trying to make a safe version of io-pages and also trying to change from pages to pfn_t in large areas but neither approach seemed likely to get any traction in the community, at least not in the near term.

Logan

[1] ACQUIRES VS I/O ACCESSES in https://www.kernel.org/doc/Documentation/memory-barriers.txt