Re: [PATCH] refcount_t: add ACQUIRE ordering on success for dec(sub)_and_test variants
From: Andrea Parri
Date: Tue Jan 29 2019 - 22:33:44 EST
> So, you are saying that ACQUIRE does not guarantee that "po-later stores
> on the same CPU and all propagated stores from other CPUs
> must propagate to all other CPUs after the acquire operation "?
> I was reading about acquire before posting this and trying to understand,
> and this was my conclusion that it should provide this, but I can easily be wrong
> on this.
>
> Andrea, Peter, could you please comment?
Short version: I am not convinced by the above sentence, and I suggest
removing it (as done in
http://lkml.kernel.org/r/20190128142910.GA7232@andrea ).
---
To elaborate: I think that we should first discuss the meaning of that
"[...] after the acquire operation (does)", because there is no notion
of "ACQUIRE (or more generally, load) propagation" in the LKMM:
Stores _propagate_ (after being executed) to other CPUs. Loads _execute_
(possibly multiple times or speculatively, but this is irrelevant for
the discussion below).
A detailed, but still informal, description of these concepts is in:
tools/memory-model/Documentation/explanation.txt
(cf., in particular, section "AN OPERATIONAL MODEL"); I can illustrate
them with an example:
    { initially: x=0, y=0; }

    CPU0                    CPU1
    --------------------------------------
    LOAD-ACQUIRE x=0        LOAD y=1
    STORE y=1
In this scenario,
a) CPU0's "LOAD-ACQUIRE x=0" executes before CPU0's "STORE y=1"
executes (this is guaranteed by the ACQUIRE),
b) CPU0's "STORE y=1" executes before "STORE y=1" propagates to
CPU1 (a store cannot be propagated before being executed),
c) CPU0's "STORE y=1" propagates to CPU1 before CPU1's "LOAD y=1"
executes (since CPU1 "sees the store").
The example also illustrates the following property:
ACQUIRE guarantees that po-later stores on the same CPU must
propagate to all other CPUs after the acquire _executes_.
(combine (a) and (b) ).
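
The chain (a), (b), (c) can be checked mechanically with a toy timeline
enumeration. This is only a sketch under stated assumptions: it is not the
LKMM or herd7, the event names are made up for illustration, and it treats
"executes" and "propagates" as points on a single global timeline.

```python
from itertools import permutations

# Toy model (an assumption, not the LKMM itself): each event gets a position
# on one global timeline, and rules (a)-(c) become ordering constraints.
EVENTS = ("acq_exec",      # CPU0: LOAD-ACQUIRE x=0 executes
          "store_exec",    # CPU0: STORE y=1 executes
          "store_prop1",   # STORE y=1 propagates to CPU1
          "load_exec")     # CPU1: LOAD y executes

def timelines(cpu1_reads_1=True):
    """Yield every timeline consistent with (a), (b) and, optionally, (c)."""
    for order in permutations(EVENTS):
        t = {ev: i for i, ev in enumerate(order)}
        if not t["acq_exec"] < t["store_exec"]:      # (a) ACQUIRE ordering
            continue
        if not t["store_exec"] < t["store_prop1"]:   # (b) execute, then propagate
            continue
        if cpu1_reads_1 and not t["store_prop1"] < t["load_exec"]:  # (c)
            continue
        yield t

valid = list(timelines())
# (a) + (b): the store never propagates before the acquire executes.
acquire_first = all(t["acq_exec"] < t["store_prop1"] for t in valid)
print(len(valid), acquire_first)
```

With (c) enforced the constraints form a single chain, so exactly one
timeline survives; dropping (c) (CPU1 may read y=0) leaves more timelines,
but the acquire still executes before the store propagates in all of them.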
OTOH, please notice that:
ACQUIRE does _NOT_ guarantee that all propagated stores from
other CPUs (to the CPU executing the ACQUIRE) must propagate
to all other CPUs after the acquire operation _executes_.
In fact, we've already seen how full barriers can be used to break such
"guarantee"; for example, in
    { initially: x=0, y=0; }

    CPU0                    CPU1                    ...
    ---------------------------------------------------
    STORE x=1               LOAD x=1
                            FULL-BARRIER
                            LOAD-ACQUIRE y=0
the full barrier forces CPU0's "STORE x=1" (seen by/propagated to CPU1)
to be propagated to all CPUs _before_ "LOAD-ACQUIRE y=0" is executed.
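
The same kind of toy enumeration can illustrate this counterexample. Again
a sketch under assumptions, not the LKMM: "CPU2" is a hypothetical third
CPU standing in for the "..." column, the event names are invented, and the
strong-fence rule is encoded as a single ordering constraint.

```python
from itertools import permutations

# Toy model (an assumption, not the LKMM itself). CPU2 is hypothetical: it
# stands in for the "..." column of the example.
EVENTS = ("store_exec",   # CPU0: STORE x=1 executes
          "prop_cpu1",    # STORE x=1 propagates to CPU1
          "prop_cpu2",    # STORE x=1 propagates to CPU2
          "load_exec",    # CPU1: LOAD x=1 executes
          "fence_exec",   # CPU1: FULL-BARRIER executes
          "acq_exec")     # CPU1: LOAD-ACQUIRE y=0 executes

def timelines(fence_propagates=True):
    """Yield timelines allowed by the constraints of the example."""
    for order in permutations(EVENTS):
        t = {ev: i for i, ev in enumerate(order)}
        ok = (t["store_exec"] < t["prop_cpu1"]      # execute, then propagate
              and t["store_exec"] < t["prop_cpu2"]
              and t["prop_cpu1"] < t["load_exec"]   # CPU1 sees x=1
              and t["load_exec"] < t["fence_exec"] < t["acq_exec"])  # barrier
        # Strong-fence rule: a store that reached CPU1 before the fence must
        # reach every CPU before anything po-after the fence executes.
        if fence_propagates:
            ok = ok and t["prop_cpu2"] < t["acq_exec"]
        if ok:
            yield t

with_fence = list(timelines())
without_rule = list(timelines(fence_propagates=False))
# Timelines in which x=1 reaches CPU2 only after the acquire has executed.
late = [t for t in without_rule if t["acq_exec"] < t["prop_cpu2"]]
print(len(with_fence), len(without_rule), len(late))
```

In this toy model, every timeline that honours the fence has "STORE x=1"
reaching CPU2 before the acquire executes, which is the opposite of what
the disputed sentence would require; the only "late" timeline exists when
the fence's propagation effect is switched off.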
Does this make sense?
> > Is ACQUIRE strictly stronger than control dependency?
>
> In my understanding yes.
+1 (or we have a problem)
>
> > It generally looks so unless there is something very subtle that I am
> > missing. If so, should we replace it with just "RELEASE ordering +
> > ACQUIRE ordering on success"? Looks simpler with less magic trickery.
>
> I was just trying to mention all the applicable orderings/guarantees.
> I can remove "control dependency" part if it is easier for people to understand
> (the main goal of documentation).
This sounds like a good idea; thank you, Dmitry, for pointing this out.
Andrea
>
> Best Regards,
> Elena.