Re: [PATCH] refcount_t: add ACQUIRE ordering on success for dec(sub)_and_test variants

From: Andrea Parri
Date: Tue Jan 29 2019 - 22:33:44 EST


> So, you are saying that ACQUIRE does not guarantee that "po-later stores
> on the same CPU and all propagated stores from other CPUs
> must propagate to all other CPUs after the acquire operation "?
> I was reading about acquire before posting this and trying to understand,
> and this was my conclusion that it should provide this, but I can easily be wrong
> on this.
>
> Andrea, Peter, could you please comment?

Short version: I am not convinced by the above sentence, and I suggest
to remove it (as done in

20190128142910.GA7232@andrea">http://lkml.kernel.org/r/20190128142910.GA7232@andrea ).

---
To elaborate: I think that we should first discuss the meaning of that
"[...] after the acquire operation (does)", because there is no notion
of "ACQUIRE (or more generally, load) propagation" in the LKMM:

Stores propagate (after being executed) to other CPUs. Loads _execute_
(possibly multiple times /speculatively, but this is irrelevant for the
discussion below).

A detailed, but still informal, description of these concepts is in:

tools/memory-model/Documentation/explanation.txt

(c.f., in particular, section "AN OPERATIONAL MODEL"); I can illustrate
them with an example:

{ initially: x=0, y=0; }

CPU0 CPU1
--------------------------------------
LOAD-ACQUIRE x=0 LOAD y=1
STORE y=1

In this scenario,

a) CPU0's "LOAD-ACQUIRE x=0" executes before CPU0's "STORE y=1"
executes (this is guaranteed by the ACQUIRE),

b) CPU0's "STORE y=1" executes before "STORE y=1" propagates to
CPU1 (a store cannot be propagated before being executed),

c) CPU0's "STORE y=1" propagates to CPU1 before CPU1's "LOAD y=1"
executes (since CPU1 "sees the store").

The example also illustrates the following property:

ACQUIRE guarantees that po-later stores on the same CPU must
propagate to all other CPUs after the acquire _executes_.

(combine (a) and (b) ).

OTOH, please notice that:

ACQUIRE does _NOT_ guarantee that all propagated stores from
other CPUs (to the CPU executing the ACQUIRE) must propagate
to all other CPUs after the acquire operation _executes_.

In fact, we've already seen how full barriers can be used to break such
"guarantee"; for example, in

{ initially: x=0, y=0; }

CPU0 CPU1 ...
---------------------------------------------------
STORE x=1 LOAD x=1
FULL-BARRIER
LOAD-ACQUIRE y=0

the full barrier forces CPU0's "STORE x=1" (seen by/propagated to CPU1)
to be propagated to all CPUs _before_ "LOAD-ACQUIRE y=0" is executed.

Does this make sense?


> > Is ACQUIRE strictly stronger than control dependency?
>
> In my understanding yes.

+1 (or we have a problem)


>
> > It generally looks so unless there is something very subtle that I am
> > missing. If so, should we replace it with just "RELEASE ordering +
> > ACQUIRE ordering on success"? Looks simpler with less magic trickery.
>
> I was just trying to mention all the applicable orderings/guarantees.
> I can remove "control dependency" part if it is easier for people to understand
> (the main goal of documentation).

This sounds like a good idea; thank you, Dmitry, for pointing this out.

Andrea


>
> Best Regards,
> Elena.