Re: [PATCH] add delay between port write and port read

From: Maciej W. Rozycki (macro@xxxxxxxxxxxxxx)
Date: Tue Feb 26 2019 - 12:52:56 EST


Linus,

I have added you to this discussion in hope you can clear any uncertainty
there might be about MMIO ordering properties of the Alpha; a question for
you is towards the end of this e-mail.

On Thu, 21 Feb 2019, Mikulas Patocka wrote:

> Do you think that we should fix this by identifying hardware that needs
> the delays and adding the delays there?

First of all we need to correctly identify what is really going on there.

And I've had a look at our arch/alpha/kernel/io.c and
arch/alpha/include/asm/io.h as they stand now and AFAICT what they have is
completely broken and if it works anywhere, then by pure luck only, e.g.
because external bus accesses complete quickly enough for code execution
not to progress to a subsequent external bus access.

The thing is taking for example `ioreadX' and `iowriteX' we have `mb'
before a write and `mb' after a read. So if we do say (addresses are
arbitrary for illustration only):

iowrite8(0, 1);
ioread8(2);

then these effectively expand to chipset-specific:

mb();
foo_iowrite8(0, 1);
foo_ioread8(2);
mb();

Which means `foo_iowrite8' and `foo_ioread8' are unordered WRT each other
as there is no barrier between them. Worse yet for:

iowrite8(0, 1);
ioread8(1);

the `ioread8' operation may not even appear externally, as the CPU may
peek the value in the CPU's write buffer while the write is still pending.
A similar consideration can be done for `readX' and `writeX', as well as a
mixture between the two kinds of these accesses.

We do need to add a preceding `mb' to all the `*readX' accessors to
satisfy the strong ordering requirement WRT any preceding `*writeX'
access. We need that leading `mb' in `*_relaxed' accessors as well. And
we need to alias `mmiowb' to `wmb' (unless it's removed as I've seen
with Will's recent patches).

NB, the 82378ZB SIO adds a minimum of 5 ISA clocks between back-to-back
accesses from PCI to ISA, which AFAIU should be enough for the PC87332
Super I/O. More can be configured if required, but I doubt that is
needed, because it wasn't for 20+ years.

> In my opinion, adding mb() to the port accessing functions is safer - it
> is 6 line patch.

The amount of code does not matter as long as it is wrong. We need to
get this right.

> Reading all the hardware manuals is time consuming and hardly anyone will
> do it for 25 years old hardware.

I do this regularly and I'm fine doing it this time too, also for the
value of the education gained. I'll see if I can experiment with my
hardware too, but regrettably that'll have to wait a couple of months at
the very least (due to Brexit :( -- I have left my Alpha in the UK and the
very last weekend I literally headed off to stay away from all of the
upcoming mess).

Meanwhile here is what I have tracked down in the architecture manual[1]:

"Access to local I/O space does not cause any implicit read/write
ordering; explicit barrier instructions must be used to ensure any desired
ordering. Barrier instructions must be used:

* After updating a memory-resident data structure and before writing a
local I/O space location to notify the device of the updates.

* Between multiple consecutive direct accesses to local I/O space, e.g.
device control registers, if those accesses are expected to be ordered
at the device.

Again, note that implementations may cache not only memory-resident data
structures, but also local I/O space locations."

-- which in my opinion makes it clear that we need these barriers that
were removed with commit 92d7223a7423 ("alpha: io: reorder barriers to
guarantee writeX() and iowriteX() ordering #2"), in addition to ones added
there (the `*_relaxed' accessors don't need the latter ones though).

NB for some reason the whole chapter on I/O has been reduced in later
revisions of the architecure manual[2][3] (as well as the shortened
version of the book called the architecture handbook) to a single-page
overview lacking any details. I think that does not make the architecture
ordered any more strongly WRT MMIO though; instead it makes it even less
specified.

Linus, can you confirm or contradict my conclusions taken from the
manual in the context of their later removal?

You did the Alpha port back in mid-1990s (BTW, Jon 'maddog' spoke very
fondly about that effort at FOSDEM earlier this month) and certainly had
to know these details to complete it and you worked with people directly
involved with the design of the Alpha at the time it was being developed,
and given the somewhat terse formal specification you could therefore be
the only person around I know of who can give a definite answer. I hope
you still remember the bits.

Questions, thoughts?

References:

[1] Richard L. Sites, ed., "Alpha Architecture Reference Manual",
Digital Press, 1992, Chapter 8, "Input/Output (I)", Section 8.2.1
"Read/Write Ordering",
<http://www.bitsavers.org/pdf/dec/alpha/Sites_AlphaArchitectureReferenceManual_1992.pdf>.

[2] Richard L. Sites, Richard T. Witek, "Alpha AXP Architecture Reference
Manual", Second Edition, Digital Press, 1995, Chapter 8 "Input/Output
Overview (I)",
<http://www.bitsavers.org/pdf/dec/alpha/Sites_AlphaAXPArchitectureReferenceManual_2ed_1995.pdf>.

[3] "Alpha Architecture Reference Manual", Fourth Edition, Compaq Computer
Corporation, January 2002, Chapter 8 "Input/Output Overview (I)",
<https://download.majix.org/dec/alpha_arch_ref.pdf>.

Maciej