Re: [PATCH] add delay between port write and port read

From: Will Deacon (will.deacon@xxxxxxx)
Date: Tue Feb 26 2019 - 13:38:01 EST

On Tue, Feb 26, 2019 at 10:12:03AM -0800, Linus Torvalds wrote:
> On Tue, Feb 26, 2019 at 9:52 AM Maciej W. Rozycki <macro@xxxxxxxxxxxxxx> wrote:
> >
> > The thing is taking for example `ioreadX' and `iowriteX' we have `mb'
> > before a write and `mb' after a read. So if we do say (addresses are
> > arbitrary for illustration only):
> >
> > iowrite8(0, 1);
> > ioread8(2);
> >
> > then these effectively expand to chipset-specific:
> >
> > mb();
> > foo_iowrite8(0, 1);
> > foo_ioread8(2);
> > mb();
> Yeah, looking at the current alpha io.c file, I'd suggest that all IO
> operations just do the "mb()" _before_ doing the op.
> That way all IO ops are at least ordered wrt each other, any any IO
> ops are ordered wrt any memory writes before it (in particular the
> "people wrote to memory, and then did a IO write to start DMA" case).
> Also, since "spin_unlock()" on alpha has a memory barrier before
> releasing the lock, it means that IO ends up being ordered wrt locks
> too.
> Does it mean that a "ioread()" can then be delayed until after later
> non-IO memory operations? Yes. But that seems harmless
> That said, I didn't spend a _ton_ of time thinking about this and
> looking at the code, but it does seem like the simplest model really
> is: "just always do a mb() before any IO ops". I think the "order IO
> wrt each other, and wrt any _preceding_ memory traffic (and all
> locking) is the important part, and that would seem to guarantee it.
> Obviously there can then be other re-ordering at the IO layer level
> (ie posted writes and IO being re-ordered across different devices
> etc), but that's universal and not alpha-specific.
> Does anybody see any worries with the "just move the mb() earlier in
> the ioread functions" model?

That makes sense to me for this Alpha-specific case, but in general I
don't think we require that I/O accesses to different endpoints are
ordered with respect to each other, which was implied by one of Maciej's
examples. e.g.

writeb(0x42, &foo_mmio_reg);

are not ordered with respect to each other (and I think if you wanted
this ordering in general you're going to need more than just fences).

I've been trying to write this up in memory-barriers.txt, so if anybody
has comments or concerns I'm keen to hear them, especially if there's
a feeling that we should be providing stronger guarantees for port I/O: