Re: [tip:locking/core] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

From: Peter Zijlstra
Date: Wed Sep 08 2021 - 07:44:37 EST


On Wed, Sep 08, 2021 at 01:00:26PM +0200, Peter Zijlstra wrote:
> On Tue, Oct 02, 2018 at 03:11:10AM -0700, tip-bot for Alan Stern wrote:
> > Commit-ID: 6e89e831a90172bc3d34ecbba52af5b9c4a447d1
> > Gitweb: https://git.kernel.org/tip/6e89e831a90172bc3d34ecbba52af5b9c4a447d1
> > Author: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> > AuthorDate: Wed, 26 Sep 2018 11:29:17 -0700
> > Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> > CommitDate: Tue, 2 Oct 2018 10:28:01 +0200
> >
> > tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire
> >
> > More than one kernel developer has expressed the opinion that the LKMM
> > should enforce ordering of writes by locking. In other words, given
> > the following code:
> >
> > WRITE_ONCE(x, 1);
> > spin_unlock(&s):
> > spin_lock(&s);
> > WRITE_ONCE(y, 1);
> >
> > the stores to x and y should be propagated in order to all other CPUs,
> > even though those other CPUs might not access the lock s. In terms of
> > the memory model, this means expanding the cumul-fence relation.
>
> Let me revive this old thread... recently we ran into the case:
>
> WRITE_ONCE(x, 1);
> spin_unlock(&s):
> spin_lock(&r);
> WRITE_ONCE(y, 1);
>
> which is distinct from the original in that UNLOCK and LOCK are not on
> the same variable.
>
> I'm arguing this should still be RCtso by reason of:
>
> spin_lock() requires an atomic-acquire which:
>
> TSO-arch) implies smp_mb()
> ARM64) is RCsc for any stlr/ldar
> Power) LWSYNC
> Risc-V) fence r , rw
> *) explicitly has smp_mb()
>
>
> However, Boqun points out that the memory model disagrees, per:
>
> https://lkml.kernel.org/r/YTI2UjKy+C7LeIf+@boqun-archlinux
>
> Is this an error/oversight of the memory model, or did I miss a subtlety
> somewhere?

Hmm.. that argument isn't strong enough for Risc-V if I read that FENCE
thing right. That's just R->RW ordering, which doesn't constrain the
first WRITE_ONCE().

However, that spin_unlock has "fence rw, w" with a subsequent write. So
the whole thing then becomes something like:


WRITE_ONCE(x, 1);
FENCE RW, W
WRITE_ONCE(s.lock, 0);
AMOSWAP %0, 1, r.lock
FENCE R, WR
WRITE_ONCE(y, 1);


Which I think is still sufficient, irrespective of the whole s!=r thing.