Re: [RFC PATCH 11/20] ia64: Add unconditional mmiowb() to arch_spin_unlock()

From: Nicholas Piggin
Date: Tue Feb 26 2019 - 23:40:13 EST


Will Deacon wrote on February 23, 2019 4:50 am:
> The mmiowb() macro is horribly difficult to use and drivers will continue
> to work most of the time if they omit a call when it is required.
>
> Rather than rely on driver authors getting this right, push mmiowb() into
> arch_spin_unlock() for ia64. If this is deemed to be a performance issue,
> a subsequent optimisation could make use of ARCH_HAS_MMIOWB to elide
> the barrier in cases where no I/O writes were performed inside the
> critical section.
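
For anyone following along, the elision idea in the last quoted sentence
can be sketched roughly as follows. This is a userspace toy, not kernel
code and not the code from this series: every name here (mmiowb_track,
fake_mmiowb, the sketch_* helpers, run_demo) is made up for illustration,
and the "barrier" is only a counter so the effect is visible.

```c
#include <stdbool.h>

/* Hypothetical tracking state; a real kernel would keep one per CPU. */
struct mmiowb_track {
	int  nesting;	/* number of spinlocks currently held */
	bool pending;	/* an MMIO write was issued while a lock was held */
};

static struct mmiowb_track track;
static int barriers_issued;	/* counts simulated barriers */

/* Stand-in for the real barrier: just record that it was "executed". */
static void fake_mmiowb(void)
{
	barriers_issued++;
}

static void sketch_spin_lock(void)
{
	track.nesting++;
}

/* Stand-in for an MMIO write accessor such as writel(). */
static void sketch_writel(void)
{
	if (track.nesting > 0)
		track.pending = true;
}

/* Unlock pays for the barrier only if an MMIO write happened under a lock. */
static void sketch_spin_unlock(void)
{
	if (track.pending) {
		track.pending = false;
		fake_mmiowb();
	}
	track.nesting--;
}

/* One critical section without I/O, one with; returns barriers issued. */
static int run_demo(void)
{
	sketch_spin_lock();
	sketch_spin_unlock();		/* no MMIO write: barrier elided */

	sketch_spin_lock();
	sketch_writel();
	sketch_spin_unlock();		/* MMIO write seen: barrier issued */

	return barriers_issued;
}
```

The unconditional variant in this patch corresponds to calling the
barrier in every unlock; the sketch only shows what the optional
follow-up optimisation would save.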

mmiowb() was always the wrong approach. IIRC what happened is that an
ia64 platform found that real wmb() semantics were too expensive, so
they kind of "relaxed" it, breaking everything, and then said drivers
that wanted to unbreak themselves had to add these mmiowb() in.

The right way to go of course would have been to implement wmb()
the way existing drivers expected, and to add a faster io_wmb() that
only ordered mmio stores from the CPU, used in the few drivers that
the platform cared about.

I think it was argued that wmb() was still technically correct because
the reordering did not happen at the CPU, but somewhere else in the
interconnect or PCI controller. But that was just a crazy burden to
put on driver writers, and it was why the documentation was always
incomprehensible.

Not sure why Linus ever went along with it, but awesome you're removing
it. Thank you!

Thanks,
Nick