Re: [tip:locking/core] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

From: Boqun Feng
Date: Fri Sep 10 2021 - 10:17:12 EST

Next message: Arnaldo Carvalho de Melo: "Re: [PATCH 0/3] perf report: Add support to print a textual representation of IBS raw sample data"
Previous message: Michael Ellerman: "Re: [PATCH v1 01/13] perf/core: add union to struct perf_branch_entry"
In reply to: Dan Lustig: "Re: [tip:locking/core] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire"
Next in thread: Linus Torvalds: "Re: [tip:locking/core] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire"

On Fri, Sep 10, 2021 at 09:48:55AM -0400, Dan Lustig wrote:
> On 9/10/2021 6:04 AM, Boqun Feng wrote:
> > On Fri, Sep 10, 2021 at 11:33:25AM +0200, Peter Zijlstra wrote:
> >> On Fri, Sep 10, 2021 at 08:01:14AM +0800, Boqun Feng wrote:
> >>> On Thu, Sep 09, 2021 at 01:03:18PM -0400, Dan Lustig wrote:
> >>>> On 9/9/2021 9:35 AM, Will Deacon wrote:
> >>>>> On Thu, Sep 09, 2021 at 09:25:30AM +0200, Peter Zijlstra wrote:
> >>
> >>>>>> The AMOSWAP is a RmW and as such matches the W from the RW->W fence,
> >>>>>> similarly it matches the R from the R->RW fence, yielding an:
> >>>>>>
> >>>>>> RW-> W
> >>>>>> RmW
> >>>>>> R ->RW
> >>>>>>
> >>>>>> ordering. It's the stores S and R that can be re-ordered, but not the
> >>>>>> sections themselves (same on PowerPC and many others).
> >>
> >>>> I agree with Will here. If the AMOSWAP above is actually implemented with
> >>>> a RISC-V AMO, then the two critical sections will be separated as if RW,RW,
> >>>> as Peter described. If instead it's implemented using LR/SC, then RISC-V
> >>>
> >>> Just out of curiosity, in the following code, can the store S and load L
> >>> be reordered?
> >>>
> >>> WRITE_ONCE(x, 1); // store S
> >>> FENCE RW, W
> >>> WRITE_ONCE(s.lock, 0); // unlock(s)
> >>> AMOSWAP %0, 1, s.lock // lock(s)
> >>> FENCE R, RW
> >>> r1 = READ_ONCE(y); // load L
> >>>
> >>> I think they can, because neither "FENCE RW, W" nor "FENCE R, RW" order
> >>> them.
> >>
> >> I'm confused by your argument, per the above quoted section, those
> >> fences and the AMO combine into a RW,RW ordering which is (as per the
> >> later clarification) multi-copy-atomic, aka smp_mb().
> >>
> >
> > Right, my question is more about the reasoning for why fence rw,w +
> > AMO + fence r,rw acts as a fence rw,rw.
>
> Is this a RISC-V question? If so, it's as simple as:

Yep, and thanks for the answer.

> 1) S and anything earlier are ordered before the AMO by the first fence
> 2) L and anything later are ordered after the AMO by the second fence
> 3) 1 + 2 = S and anything earlier are ordered before L or anything later
>
> Since RISC-V is multi-copy atomic, 1 and 2 just naturally compose
> transitively.
>
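The three steps above can be sketched as a tiny ordering model. This is only an illustration under simplifying assumptions, not the RVWMO formal model: it looks at a single hart, considers only fence-induced ordering (no same-address, dependency, or acquire/release rules), treats an AMO as one access that is both a load and a store, and assumes ordering edges compose transitively, which is where multi-copy atomicity comes in. All names (struct op, ordered_before(), the ACC_* constants, and the three example functions) are made up for this sketch.

```c
#include <stdbool.h>

/* Access kinds; an AMO counts as both a load and a store. */
enum { ACC_R = 1, ACC_W = 2, ACC_RW = ACC_R | ACC_W };

/* One program-order slot: either a memory access or a fence. */
struct op {
    bool is_fence;
    int kind;   /* access only: ACC_R, ACC_W, or ACC_RW (AMO) */
    int pred;   /* fence only: predecessor set */
    int succ;   /* fence only: successor set */
};

#define ACCESS(k)   { false, (k), 0, 0 }
#define FENCE(p, s) { true, 0, (p), (s) }
#define MAX_OPS 16

/* True if access a is ordered before access b in this simplified model. */
static bool ordered_before(const struct op *prog, int n, int a, int b)
{
    bool ord[MAX_OPS][MAX_OPS] = { { false } };

    if (n > MAX_OPS || a < 0 || b < 0 || a >= n || b >= n)
        return false;

    /* Direct edges: a fence orders earlier accesses in its predecessor
     * set before later accesses in its successor set. */
    for (int f = 0; f < n; f++) {
        if (!prog[f].is_fence)
            continue;
        for (int i = 0; i < f; i++)
            for (int j = f + 1; j < n; j++)
                if (!prog[i].is_fence && !prog[j].is_fence &&
                    (prog[i].kind & prog[f].pred) &&
                    (prog[j].kind & prog[f].succ))
                    ord[i][j] = true;
    }

    /* Transitive closure (Warshall): the "1 + 2" composition step. */
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (ord[i][k] && ord[k][j])
                    ord[i][j] = true;

    return ord[a][b];
}

/* store S; fence rw,w; unlock store; AMOSWAP; fence r,rw; load L */
bool amo_case_orders_S_before_L(void)
{
    const struct op prog[] = {
        ACCESS(ACC_W),          /* 0: WRITE_ONCE(x, 1), store S */
        FENCE(ACC_RW, ACC_W),   /* 1: fence rw,w */
        ACCESS(ACC_W),          /* 2: WRITE_ONCE(s.lock, 0) */
        ACCESS(ACC_RW),         /* 3: AMOSWAP on s.lock */
        FENCE(ACC_R, ACC_RW),   /* 4: fence r,rw */
        ACCESS(ACC_R),          /* 5: READ_ONCE(y), load L */
    };
    return ordered_before(prog, 6, 0, 5);
}

/* Same shape, but the AMO is replaced by a plain store. */
bool plain_store_case_orders_S_before_L(void)
{
    const struct op prog[] = {
        ACCESS(ACC_W), FENCE(ACC_RW, ACC_W), ACCESS(ACC_W),
        ACCESS(ACC_W), FENCE(ACC_R, ACC_RW), ACCESS(ACC_R),
    };
    return ordered_before(prog, 6, 0, 5);
}

/* load; fence rw,w; store; fence w,rw; load */
bool load_store_load_case_orders_loads(void)
{
    const struct op prog[] = {
        ACCESS(ACC_R), FENCE(ACC_RW, ACC_W), ACCESS(ACC_W),
        FENCE(ACC_W, ACC_RW), ACCESS(ACC_R),
    };
    return ordered_before(prog, 5, 0, 4);
}
```

In this model the AMO case orders S before L only through the AMO itself: neither fence relates the two accesses directly, but the AMO sits in the successor set of the first fence and in the predecessor set of the second. With a plain store in place of the AMO, nothing falls in the predecessor set of fence r,rw, so the model finds no S-before-L ordering.
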
> > Another related question: can
> > fence rw,w + store + fence w,rw act as a fence rw,rw by similar
> > reasoning? IOW, can the two loads in the following be reordered?
> >
> > r1 = READ_ONCE(x);
> > FENCE RW, W
> > WRITE_ONCE(z, 1);
> > FENCE W, RW
> > r2 = READ_ONCE(y);
> >
> > Again, this is more a question out of curiosity; I'm not claiming
> > this pattern is useful.
>
> Does FENCE W,RW appear in some actual use case? But yes, if it does

I'm not aware of any, probably because no other arch can order
write->read without a full barrier (or release+acquire if RCsc). We have
a few patterns in the kernel where we only want to order write->read,
and smp_mb() is used there. If FENCE W,R is cheaper than FENCE RW,RW on
RISC-V, then *in theory* we could implement smp_wrmb() as FENCE W,R on
RISC-V and as smp_mb() on other archs.

/me runs

And I'm sure there are cases where we use smp_mb() when only
write->{read,write} needs to be ordered, so there may be a use case
for the same reason.

I'm not proposing we do anything, just saying that we don't use FENCE
W,RW because there is no equivalent concept in other archs, so it's not
modeled by an API. Besides, it may not be cheaper than FENCE RW,RW on
RISC-V.
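Purely to illustrate the shape of the idea (again, not a proposal), a userspace sketch could look like the following. Everything here is an assumption made for illustration: smp_wrmb_sketch(), flag_a, flag_b and store_then_load_sketch() are made-up names, the C11 seq_cst fence only stands in for smp_mb() on non-RISC-V targets, and mixing an inline-asm fence with relaxed C11 atomics is not something the C standard itself gives ordering guarantees for.

```c
#include <stdatomic.h>

/*
 * Hypothetical write->read barrier: "fence w,r" on RISC-V, a full
 * fence everywhere else (userspace stand-in for smp_mb()).
 */
#if defined(__riscv)
#define smp_wrmb_sketch() __asm__ __volatile__("fence w,r" ::: "memory")
#else
#define smp_wrmb_sketch() atomic_thread_fence(memory_order_seq_cst)
#endif

static _Atomic int flag_a;
static _Atomic int flag_b;

/*
 * One side of a store-buffering pattern: publish our flag, then look
 * at the other side's flag. Only the earlier store needs to be
 * ordered before the later load.
 */
int store_then_load_sketch(void)
{
    atomic_store_explicit(&flag_a, 1, memory_order_relaxed);
    smp_wrmb_sketch();
    return atomic_load_explicit(&flag_b, memory_order_relaxed);
}
```

Whether the narrower fence would actually be cheaper than FENCE RW,RW depends on the implementation.
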

Regards,
Boqun

> appear, this sequence would also act as a FENCE RW,RW on RISC-V.
>
> Dan
>
> > Regards,
> > Boqun
> >
> >> As such, S and L are not allowed to be re-ordered in the given scenario.
> >>
> >>> Note that the reordering is allowed in LKMM, because unlock-lock
> >>> only needs to be as strong as RCtso.
> >>
> >> RISC-V is strictly stronger than required in this instance, given the
> >> current lock implementation. Daniel pointed out that if the atomic op
> >> were LL/SC based instead of AMO it would end up being RCtso.
> >>
