Re: [RFC][PATCH v3]: documentation,atomic: Add new documents

From: Paul E. McKenney
Date: Tue Aug 01 2017 - 12:17:01 EST

Next message: Heiko Stuebner: "[PATCH 2/2] clk: rockchip: add special approximation to fix up fractional clk's jitter"
Previous message: Thiago Jung Bauermann: "Re: [RFC v6 21/62] powerpc: introduce execute-only pkey"
In reply to: Peter Zijlstra: "Re: [RFC][PATCH v3]: documentation,atomic: Add new documents"
Next in thread: Peter Zijlstra: "Re: [RFC][PATCH v3]: documentation,atomic: Add new documents"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Aug 01, 2017 at 01:17:13PM +0100, Will Deacon wrote:
> On Tue, Aug 01, 2017 at 01:47:44PM +0200, Peter Zijlstra wrote:
> > On Tue, Aug 01, 2017 at 11:19:00AM +0100, Will Deacon wrote:
> > > On Tue, Aug 01, 2017 at 11:01:21AM +0200, Peter Zijlstra wrote:
> > > > On Mon, Jul 31, 2017 at 10:43:45AM -0700, Paul E. McKenney wrote:
> > > >
> > > > > Why wouldn't the following have ACQUIRE semantics?
> > > > >
> > > > > atomic_inc(&var);
> > > > > smp_mb__after_atomic();
> > > > >
> > > > > Is the issue that there is no actual value returned or some such?
> > > >
> > > > Yes, so that the inc is a load-store, and thus there is a load, we loose
> > > > the value.
> > > >
> > > > But I see your point I think. Irrespective of still having the value,
> > > > the ordering is preserved and nothing should pass across that.
> > > >
> > > > > So if I have something like this, the assertion really can trigger?
> > > > >
> > > > > WRITE_ONCE(x, 1); atomic_inc(&y);
> > > > > r0 = xchg_release(&y, 5); smp_mb__after_atomic();
> > > > > r1 = READ_ONCE(x);
> > > > >
> > > > >
> > > > > WARN_ON(r0 == 0 && r1 == 0);
> > > > >
> > > > > I must confess that I am not seeing why we would want to allow this
> > > > > outcome.
> > > >
> > > > No you are indeed quite right. I just wasn't creative enough. Thanks for
> > > > the inspiration.
> > >
> > > Just to close this out, we agree that an smp_rmb() instead of
> > > smp_mb__after_atomic() would *not* forbid this outcome, right?
> >
> > So that really hurts my brain. Per the normal rules that smp_rmb() would
> > order the read of @x against the last ll of @y and per ll/sc ordering
> > you then still don't get to make the WARN happen.
> >
> > On IRC you explained that your 8.1 LSE instructions are not in fact
> > ordered by a smp_rmb, only by smp_wmb, which is 'surprising' since you
> > really need to load the old value to compute the new value.
>
> To be clear, it's only the ST* variants of the LSE instructions that are
> treated as a write for the purposes of memory ordering, so these are the
> non-*_return variants. It's not unlikely that other architectures will
> exhibit the same behaviour (e.g. Power, RISC-V), because the CPU can
> treat non-return atomics as "fire-and-forget" and have them handled
> elsewhere in the memory subsystem, causing them to be treated similarly
> to posted writes.
>
> For the code snippet above, the second thread has no idea about the value
> of y and so smp_rmb() is the wrong thing to be using imo. It really cares
> about ordering the store to y before the read of x, so needs a full mb (i.e.
> the test is more like 'R' than 'MP').
>
> Also, wouldn't this problem also arise if your atomics were built using a
> spinlock where unlock had release semantics?

The current Linux kernel memory model forbids this outcome with smp_rmb(),
though I did have to work around the current lack of atomic_inc() using
xchg_relaxed(), so please review my litmus tests carefully.

C C-WillDeacon-MP+o-r+ai-rmb-o.litmus

(*
* Expected result: Never.
*
* Desired litmus test, with atomic_inc() emulated by xchg_relaxed():
*
* WRITE_ONCE(x, 1); atomic_inc(&y);
* r0 = xchg_release(&y, 5); smp_rmb();
* r1 = READ_ONCE(x);
*
*
* WARN_ON(r0 == 0 && r1 == 0);
*)

{
}

P0(int *x, int *y)
{
WRITE_ONCE(*x, 1);
r0 = xchg_release(y, 5);
}

P1(int *x, int *y)
{
r2 = xchg_relaxed(y, 1);
smp_rmb();
r1 = READ_ONCE(*x);
}

exists
(0:r0=0 /\ 1:r1=0)

Here is what herd thinks:

$ herd7 -bell strong-kernel.bell -cat weak-kernel.cat -macros linux.def ../litmus/manual/kernel/C-WillDeacon-MP+o-r+ai-rmb-o.litmus
Test C-WillDeacon-MP+o-r+ai-rmb-o Allowed
States 3
0:r0=0; 1:r1=1;
0:r0=1; 1:r1=0;
0:r0=1; 1:r1=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists (0:r0=0 /\ 1:r1=0)
Observation C-WillDeacon-MP+o-r+ai-rmb-o Never 0 3
Hash=0c3e25a94b38708a2c5ea11ff52c8077

I get the same answer from strong-kernel.cat (which is our best-guess
envelope over hardware guarantees), weak-kernel.cat (which is simplified
based on what people actually use), and proposal.cat (which is a candidate
model with further simplifications).

I converted this (possibly incorrectly) to PowerPC assembly:

PPC w-RMWl-r+w-RMWl-r.litmus
""
(*
* Does 3.0 Linux-kernel Power atomic_add_return() provide local
* barrier that orders prior stores against subsequent loads?
* Use the atomic_add_return() in both threads, but to different variables.
* And use the trailing-lwsync variant of atomic_add_return().
*)
(* 24-Aug-2011: ppcmem says "Sometimes" *)
{
0:r1=1; 0:r2=x; 0:r3=5; 0:r4=y; 0:r10=0 ; 0:r11=0;
1:r1=1; 1:r2=x; 1:r3=5; 1:r4=y; 1:r10=0 ; 1:r11=0;
}
P0 | P1 ;
stw r1,0(r2) | lwarx r11,r10,r4 ;
lwsync | stwcx. r1,r10,r4 ;
lwarx r11,r10,r4 | bne Fail1 ;
stwcx. r3,r10,r4 | lwsync ;
bne Fail0 | lwz r3,0(r2) ;
li r3,42 | Fail1: ;
Fail0: | ;

exists
(0:r11=0 /\ 0:r3=42 /\ 1:r3=0)

And ppcmem agrees with the linux-kernel memory model:

[ . . . ]

Found 82 : Prune count= 13946 seen_succs= 7453 7454 states
Found 83 : Prune count= 13997 seen_succs= 7490 7491 states
Found 84 : Prune count= 14007 seen_succs= 7506 7507 states
Found 85 : Prune count= 17229 seen_succs= 8889 8890 states
Found 86 : Prune count= 17235 seen_succs= 8897 8898 states
Test w-RMWl-r+w-RMWl-r Allowed
States 9
0:r3=5; 0:r11=0; 1:r3=0;
0:r3=5; 0:r11=0; 1:r3=1;
0:r3=5; 0:r11=0; 1:r3=5;
0:r3=5; 0:r11=1; 1:r3=0;
0:r3=5; 0:r11=1; 1:r3=1;
0:r3=42; 0:r11=0; 1:r3=1;
0:r3=42; 0:r11=0; 1:r3=5;
0:r3=42; 0:r11=1; 1:r3=0;
0:r3=42; 0:r11=1; 1:r3=1;
No (allowed not found)
Condition exists (0:r11=0 /\ 0:r3=42 /\ 1:r3=0)
Hash=58fb07516ac5697580c33e06a354f667
Observation w-RMWl-r+w-RMWl-r Never 0 9

So if ARM really needs the litmus test with smp_rmb() to be allowed,
we need to adjust the Linux-kernel memory model appropriately. Which
means that one of us needs to reach out to the usual suspects. Would
you like to do that, or would you like me to?

Thanx, Paul

Next message: Heiko Stuebner: "[PATCH 2/2] clk: rockchip: add special approximation to fix up fractional clk's jitter"
Previous message: Thiago Jung Bauermann: "Re: [RFC v6 21/62] powerpc: introduce execute-only pkey"
In reply to: Peter Zijlstra: "Re: [RFC][PATCH v3]: documentation,atomic: Add new documents"
Next in thread: Peter Zijlstra: "Re: [RFC][PATCH v3]: documentation,atomic: Add new documents"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]