Re: [RFC][PATCH v3]: documentation,atomic: Add new documents
From: Paul E. McKenney
Date: Wed Aug 02 2017 - 12:18:07 EST
On Wed, Aug 02, 2017 at 10:45:32AM +0100, Will Deacon wrote:
> Hi Paul,
>
> On Tue, Aug 01, 2017 at 09:14:12AM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 01, 2017 at 01:17:13PM +0100, Will Deacon wrote:
> > > On Tue, Aug 01, 2017 at 01:47:44PM +0200, Peter Zijlstra wrote:
> > > > On Tue, Aug 01, 2017 at 11:19:00AM +0100, Will Deacon wrote:
> > > > > On Tue, Aug 01, 2017 at 11:01:21AM +0200, Peter Zijlstra wrote:
> > > > > > On Mon, Jul 31, 2017 at 10:43:45AM -0700, Paul E. McKenney wrote:
> > > > > > > So if I have something like this, the assertion really can trigger?
> > > > > > >
> > > > > > > WRITE_ONCE(x, 1);            atomic_inc(&y);
> > > > > > > r0 = xchg_release(&y, 5);    smp_mb__after_atomic();
> > > > > > >                              r1 = READ_ONCE(x);
> > > > > > >
> > > > > > >
> > > > > > > WARN_ON(r0 == 0 && r1 == 0);
> > > > > > >
> > > > > > > I must confess that I am not seeing why we would want to allow this
> > > > > > > outcome.
> > > > > >
> > > > > > No you are indeed quite right. I just wasn't creative enough. Thanks for
> > > > > > the inspiration.
> > > > >
> > > > > Just to close this out, we agree that an smp_rmb() instead of
> > > > > smp_mb__after_atomic() would *not* forbid this outcome, right?
> > > >
> > > > So that really hurts my brain. Per the normal rules that smp_rmb() would
> > > > order the read of @x against the last ll of @y and per ll/sc ordering
> > > > you then still don't get to make the WARN happen.
> > > >
> > > > On IRC you explained that your 8.1 LSE instructions are not in fact
> > > > ordered by a smp_rmb, only by smp_wmb, which is 'surprising' since you
> > > > really need to load the old value to compute the new value.
> > >
> > > To be clear, it's only the ST* variants of the LSE instructions that are
> > > treated as a write for the purposes of memory ordering, so these are the
> > > non-*_return variants. It's not unlikely that other architectures will
> > > exhibit the same behaviour (e.g. Power, RISC-V), because the CPU can
> > > treat non-return atomics as "fire-and-forget" and have them handled
> > > elsewhere in the memory subsystem, causing them to be treated similarly
> > > to posted writes.
> > >
> > > For the code snippet above, the second thread has no idea about the value
> > > of y and so smp_rmb() is the wrong thing to be using imo. It really cares
> > > about ordering the store to y before the read of x, so needs a full mb (i.e.
> > > the test is more like 'R' than 'MP').
> > >
> > > Also, wouldn't this problem also arise if your atomics were built using a
> > > spinlock where unlock had release semantics?
> >
> > The current Linux kernel memory model forbids this outcome with smp_rmb(),
> > though I did have to work around the current lack of atomic_inc() using
> > xchg_relaxed(), so please review my litmus tests carefully.
>
> It's worth noting that we don't have the problem with any value-returning
> atomics, so all flavours of xchg in this test would be forbidden on arm64
> too.

Plus after upgrading to the latest and greatest version of herd,
atomic_inc() worked just fine. (Hey, I -try- to keep up!) The updated
litmus test is here:

https://github.com/paulmckrcu/litmus/blob/master/manual/kernel/C-WillDeacon-MP%2Bo-r%2Bai-rmb-o.litmus

Same outcome. Alan Stern is looking into what might be adjusted.
Of course, there is no guarantee that this will turn out to be reasonable
or for that matter acceptable to the usual suspects, but if feasible we
should at least see what this does to the model.
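
A minimal sketch, not taken from herd or the kernel tree, that brute-forces
the six sequentially consistent interleavings of the two-thread test may help
anyone following along without the tooling. It is only a sanity check on the
shape of the test: each step is modelled as one indivisible operation, so the
barrier in the second thread is a no-op here, and the sketch says nothing about
what weaker hardware or the Linux-kernel memory model allows. The names
struct state, step(), explore() and sc_allows() are hypothetical helpers
invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* One simulated execution: shared x and y both start at 0. */
struct state {
	int x, y;	/* shared variables */
	int r0, r1;	/* P0's xchg result, P1's read of x */
};

/* Run step pc (0 or 1) of thread tid as one indivisible operation. */
static void step(struct state *s, int tid, int pc)
{
	assert(pc == 0 || pc == 1);

	if (tid == 0 && pc == 0) {
		s->x = 1;		/* WRITE_ONCE(x, 1) */
	} else if (tid == 0) {
		s->r0 = s->y;		/* r0 = xchg_release(&y, 5) */
		s->y = 5;
	} else if (pc == 0) {
		s->y += 1;		/* atomic_inc(&y) */
	} else {
		s->r1 = s->x;		/* r1 = READ_ONCE(x); barrier not modelled */
	}
}

/* Try every interleaving; true if at least one ends with (r0, r1). */
static bool explore(struct state s, int pc0, int pc1, int r0, int r1)
{
	struct state n;

	if (pc0 == 2 && pc1 == 2)
		return s.r0 == r0 && s.r1 == r1;
	if (pc0 < 2) {
		n = s;
		step(&n, 0, pc0);
		if (explore(n, pc0 + 1, pc1, r0, r1))
			return true;
	}
	if (pc1 < 2) {
		n = s;
		step(&n, 1, pc1);
		if (explore(n, pc0, pc1 + 1, r0, r1))
			return true;
	}
	return false;
}

/* Is the final outcome (r0, r1) reachable under sequential consistency? */
bool sc_allows(int r0, int r1)
{
	struct state init = { 0, 0, 0, 0 };

	return explore(init, 0, 0, r0, r1);
}
```

Under those assumptions sc_allows(0, 0) is false while (0, 1), (1, 0) and
(1, 1) are all reachable, which matches the three states in the herd7 output
quoted below. The open question in this thread is what happens once the
second thread's accesses may be reordered, which this sketch deliberately
does not model.
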
> > C C-WillDeacon-MP+o-r+ai-rmb-o.litmus
> >
> > (*
> > * Expected result: Never.
> > *
> > * Desired litmus test, with atomic_inc() emulated by xchg_relaxed():
> > *
> > * WRITE_ONCE(x, 1);            atomic_inc(&y);
> > * r0 = xchg_release(&y, 5);    smp_rmb();
> > *                              r1 = READ_ONCE(x);
> > *
> > *
> > * WARN_ON(r0 == 0 && r1 == 0);
> > *)
> >
> > {
> > }
> >
> > P0(int *x, int *y)
> > {
> > 	WRITE_ONCE(*x, 1);
> > 	r0 = xchg_release(y, 5);
> > }
> >
> > P1(int *x, int *y)
> > {
> > 	r2 = xchg_relaxed(y, 1);
> > 	smp_rmb();
> > 	r1 = READ_ONCE(*x);
> > }
> >
> > exists
> > (0:r0=0 /\ 1:r1=0)
> >
> > Here is what herd thinks:
> >
> > $ herd7 -bell strong-kernel.bell -cat weak-kernel.cat -macros linux.def ../litmus/manual/kernel/C-WillDeacon-MP+o-r+ai-rmb-o.litmus
> > Test C-WillDeacon-MP+o-r+ai-rmb-o Allowed
> > States 3
> > 0:r0=0; 1:r1=1;
> > 0:r0=1; 1:r1=0;
> > 0:r0=1; 1:r1=1;
> > No
> > Witnesses
> > Positive: 0 Negative: 3
> > Condition exists (0:r0=0 /\ 1:r1=0)
> > Observation C-WillDeacon-MP+o-r+ai-rmb-o Never 0 3
> > Hash=0c3e25a94b38708a2c5ea11ff52c8077
> >
> > I get the same answer from strong-kernel.cat (which is our best-guess
> > envelope over hardware guarantees), weak-kernel.cat (which is simplified
> > based on what people actually use), and proposal.cat (which is a candidate
> > model with further simplifications).
> >
> > I converted this (possibly incorrectly) to PowerPC assembly:
> >
> > PPC w-RMWl-r+w-RMWl-r.litmus
> > ""
> > (*
> > * Does 3.0 Linux-kernel Power atomic_add_return() provide local
> > * barrier that orders prior stores against subsequent loads?
> > * Use the atomic_add_return() in both threads, but to different variables.
> > * And use the trailing-lwsync variant of atomic_add_return().
> > *)
> > (* 24-Aug-2011: ppcmem says "Sometimes" *)
> > {
> > 0:r1=1; 0:r2=x; 0:r3=5; 0:r4=y; 0:r10=0 ; 0:r11=0;
> > 1:r1=1; 1:r2=x; 1:r3=5; 1:r4=y; 1:r10=0 ; 1:r11=0;
> > }
> > P0               | P1               ;
> > stw r1,0(r2)     | lwarx r11,r10,r4 ;
> > lwsync           | stwcx. r1,r10,r4 ;
> > lwarx r11,r10,r4 | bne Fail1        ;
> > stwcx. r3,r10,r4 | lwsync           ;
> > bne Fail0        | lwz r3,0(r2)     ;
> > li r3,42         | Fail1:           ;
> > Fail0:           |                  ;
> >
> >
> > exists
> > (0:r11=0 /\ 0:r3=42 /\ 1:r3=0)
> >
> > And ppcmem agrees with the Linux-kernel memory model:
> >
> > [ . . . ]
> >
> > Found 82 : Prune count= 13946 seen_succs= 7453 7454 states
> > Found 83 : Prune count= 13997 seen_succs= 7490 7491 states
> > Found 84 : Prune count= 14007 seen_succs= 7506 7507 states
> > Found 85 : Prune count= 17229 seen_succs= 8889 8890 states
> > Found 86 : Prune count= 17235 seen_succs= 8897 8898 states
> > Test w-RMWl-r+w-RMWl-r Allowed
> > States 9
> > 0:r3=5; 0:r11=0; 1:r3=0;
> > 0:r3=5; 0:r11=0; 1:r3=1;
> > 0:r3=5; 0:r11=0; 1:r3=5;
> > 0:r3=5; 0:r11=1; 1:r3=0;
> > 0:r3=5; 0:r11=1; 1:r3=1;
> > 0:r3=42; 0:r11=0; 1:r3=1;
> > 0:r3=42; 0:r11=0; 1:r3=5;
> > 0:r3=42; 0:r11=1; 1:r3=0;
> > 0:r3=42; 0:r11=1; 1:r3=1;
> > No (allowed not found)
> > Condition exists (0:r11=0 /\ 0:r3=42 /\ 1:r3=0)
> > Hash=58fb07516ac5697580c33e06a354f667
> > Observation w-RMWl-r+w-RMWl-r Never 0 9
> >
> > So if ARM really needs the litmus test with smp_rmb() to be allowed,
> > we need to adjust the Linux-kernel memory model appropriately. Which
> > means that one of us needs to reach out to the usual suspects. Would
> > you like to do that, or would you like me to?
>
> If you don't mind doing it, then that would be great, thanks. Do shout if
> you need me to help with anything, though!

You will be copied, just to cut out the timezone delays if nothing else.

							Thanx, Paul