Re: [PATCH 01/20] asm-generic/mmiowb: Add generic implementation of mmiowb() tracking

From: Nicholas Piggin
Date: Mon Mar 04 2019 - 19:21:36 EST


Linus Torvalds wrote on March 4, 2019 4:48 am:
> On Sun, Mar 3, 2019 at 2:05 AM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
>>
>> Why even bother with it at all, "internal" or not? Just get rid of
>> mmiowb, the concept is obsolete.
>
> It *is* gone, for chrissake! Only the name remains as an internal
> detail of "this is what we need to do".
>
>> Pretend ia64 doesn't exist for a minute. Now the regular mb/wmb barriers
>> order IO across CPUs with respect to their cacheable accesses.
>
> Stop with the total red herring already.
>
> THIS HAS NOTHING TO DO WITH mb()/wmb().
>
> As long as you keep bringing those up, you're only showing that you're
> talking about the wrong thing.

Why? I'm talking about them because they are not taken care of by this
part of mmiowb removal. Talking about spin locks is the wrong thing
because we're already past that and everybody agrees it's the right
approach.

>> Regardless of whether that cacheable access is a spin lock, a bit lock,
>> an atomic, a mutex... This is how it was before mmiowb came along.
>
> No.
>
> Before mmiowb() came along, there was one rule: do what x86 does.
>
> And x86 orders mmio inside spinlocks.
>
> Seriously.
>
> Notice how there's not a single "barrier" mentioned here anywhere in
> the above. No "mb()", no "wmb()", no nothing. Only "spinlocks order
> IO".
>
> That's the fundamental rule (that we broke for ia64), and all that
> matters for this patch series.
>
> Stop talking about wmb(). It's irrelevant. A spinlock does not
> *contain* a wmb().

Well, you don't have to talk about it, but why do you want me to stop?
I don't understand. It's still an open topic after this series. I
can post a new thread about it if that would upset you less. I just
thought it would fit here because we're talking about mmiowb; I'm not
trying to derail this series.

> Nobody even _cares_ about wmb(). They are entirely irrelevant wrt IO,
> because IO is ordered on any particular CPU anyway (which is what
> wmb() enforces).
>
> Only when you do special things like __raw_writel() etc does wmb()
> matter, but at that point this whole series is entirely irrelevant,
> and once again, that's still about just ordering on a single CPU.
>
> So as long as you talk about wmb(), all you show is that you're
> talking about something entirely different FROM THIS WHOLE SERIES.
>
> And like it or not, ia64 still exists. We support it. It doesn't
> _matter_ and we don't much care any more, but it still exists. Which
> is why we have that concept of mmiowb().
>
> On other platforms, mmiowb() might be a wmb(). Or it might not. It
> might be some other barrier, or it might be a no-op entirely without a
> barrier at all. It doesn't matter. But mmiowb() exists, and is now
> happily entirely hidden inside the rule of "spinlocks order MMIO
> across CPU's".

The driver writer still has to know exactly as much about mmiowb
(the concept, if not the name) before this series as afterward. That
is, sequences of mmio stores to a device from different CPUs can only
be atomic if you (put mmiowb before spin unlock | protect them with
spin locks).
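
To make the rule concrete, here is a minimal userspace sketch of a
two-store command sequence that must reach the device as a unit. It is
a model only, not kernel code: fake_dev, fake_writel() and
fake_dev_send() are hypothetical names, a pthread mutex stands in for
the spin lock, and the "device" is just a log of stores in arrival
order.

```c
/* Userspace model only. Every name below is a hypothetical stand-in,
 * not a kernel API. */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define LOG_SIZE 64

enum { REG_ADDR = 0, REG_DATA = 1 };

struct fake_dev {
	pthread_mutex_t lock;		/* stands in for spinlock_t */
	uint32_t log[LOG_SIZE][2];	/* (register, value) in arrival order */
	unsigned int nr;		/* number of stores logged so far */
};

static void fake_dev_init(struct fake_dev *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->nr = 0;
}

/* Stands in for writel(): records the store as the device would see it. */
static void fake_writel(struct fake_dev *dev, uint32_t val, uint32_t reg)
{
	assert(dev->nr < LOG_SIZE);
	dev->log[dev->nr][0] = reg;
	dev->log[dev->nr][1] = val;
	dev->nr++;
}

/* The address/data pair must not interleave with another CPU's pair. */
static void fake_dev_send(struct fake_dev *dev, uint32_t addr, uint32_t data)
{
	pthread_mutex_lock(&dev->lock);
	fake_writel(dev, addr, REG_ADDR);
	fake_writel(dev, data, REG_DATA);
	/*
	 * Before the series, a driver would put mmiowb() here on
	 * architectures that need it. After the series, the unlock path
	 * is expected to take care of it.
	 */
	pthread_mutex_unlock(&dev->lock);
}
```

Either way, the driver writer has to know that the pair of stores is
only atomic with respect to other CPUs because of the lock around it.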

I just don't understand the reason to expose the driver writer to
that additional detail. Intuitively, mb() should order stores to
all kinds of memory the same way smp_mb() orders stores to cacheable
memory, without the detail of stores being reordered at the
interconnect or controller. The driver writer doesn't care about store
queues in the CPU or whatever details; they want the device to see
IOs in some order.

Thanks,
Nick