Re: [RFC][PATCH v3]: documentation,atomic: Add new documents

From: Will Deacon
Date: Tue Aug 01 2017 - 08:17:21 EST


On Tue, Aug 01, 2017 at 01:47:44PM +0200, Peter Zijlstra wrote:
> On Tue, Aug 01, 2017 at 11:19:00AM +0100, Will Deacon wrote:
> > On Tue, Aug 01, 2017 at 11:01:21AM +0200, Peter Zijlstra wrote:
> > > On Mon, Jul 31, 2017 at 10:43:45AM -0700, Paul E. McKenney wrote:
> > >
> > > > Why wouldn't the following have ACQUIRE semantics?
> > > >
> > > > atomic_inc(&var);
> > > > smp_mb__after_atomic();
> > > >
> > > > Is the issue that there is no actual value returned or some such?
> > >
> > > Yes, so that the inc is a load-store, and thus there is a load, we lose
> > > the value.
> > >
> > > But I see your point I think. Irrespective of still having the value,
> > > the ordering is preserved and nothing should pass across that.
> > >
> > > > So if I have something like this, the assertion really can trigger?
> > > >
> > > > WRITE_ONCE(x, 1);            atomic_inc(&y);
> > > > r0 = xchg_release(&y, 5);    smp_mb__after_atomic();
> > > >                              r1 = READ_ONCE(x);
> > > >
> > > >
> > > > WARN_ON(r0 == 0 && r1 == 0);
> > > >
> > > > I must confess that I am not seeing why we would want to allow this
> > > > outcome.
> > >
> > > No you are indeed quite right. I just wasn't creative enough. Thanks for
> > > the inspiration.
> >
> > Just to close this out, we agree that an smp_rmb() instead of
> > smp_mb__after_atomic() would *not* forbid this outcome, right?
>
> So that really hurts my brain. Per the normal rules that smp_rmb() would
> order the read of @x against the last ll of @y and per ll/sc ordering
> you then still don't get to make the WARN happen.
>
> On IRC you explained that your 8.1 LSE instructions are not in fact
> ordered by a smp_rmb, only by smp_wmb, which is 'surprising' since you
> really need to load the old value to compute the new value.

To be clear, it's only the ST* variants of the LSE instructions that are
treated as a write for the purposes of memory ordering, so these are the
non-*_return variants. It's not unlikely that other architectures will
exhibit the same behaviour (e.g. Power, RISC-V), because the CPU can
treat non-return atomics as "fire-and-forget" and have them handled
elsewhere in the memory subsystem, causing them to be treated similarly
to posted writes.

For the code snippet above, the second thread never observes the value of y
(atomic_inc() returns nothing), so smp_rmb() is the wrong thing to be using
imo. It really cares about ordering the store to y before the read of x, so
it needs a full mb (i.e. the test is more like 'R' than 'MP').
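
For anybody who wants to poke at the shape from userspace, here is a rough
C11 sketch of the snippet with the full barrier in place. Everything in it is
my own stand-in rather than kernel code: relaxed atomic_fetch_add() plays
atomic_inc(), a seq_cst fence plays smp_mb__after_atomic(), and
count_forbidden() is a made-up helper name. Note that C11 treats every RMW as
a read, so this cannot reproduce the ST* behaviour described above; it only
shows the two-thread shape and counts how often r0 == 0 && r1 == 0 shows up,
which as far as I can tell C11 forbids here. A stress run that finds nothing
proves nothing about a memory model, of course.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;
static int r0, r1;	/* written by the threads, read only after join */

/* Thread 0: WRITE_ONCE(x, 1); r0 = xchg_release(&y, 5); */
static void *thread0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	r0 = atomic_exchange_explicit(&y, 5, memory_order_release);
	return NULL;
}

/* Thread 1: atomic_inc(&y); smp_mb__after_atomic(); r1 = READ_ONCE(x); */
static void *thread1(void *arg)
{
	(void)arg;
	/* Assumption: a relaxed fetch_add stands in for atomic_inc(). */
	atomic_fetch_add_explicit(&y, 1, memory_order_relaxed);
	/* Assumption: a seq_cst fence stands in for the full barrier. */
	atomic_thread_fence(memory_order_seq_cst);
	r1 = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

/* Hypothetical helper: run the test 'trials' times, count WARN hits. */
int count_forbidden(int trials)
{
	int hits = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t t0, t1;
		int rc;

		atomic_store(&x, 0);
		atomic_store(&y, 0);

		rc = pthread_create(&t0, NULL, thread0, NULL);
		assert(rc == 0);
		rc = pthread_create(&t1, NULL, thread1, NULL);
		assert(rc == 0);
		(void)rc;

		pthread_join(t0, NULL);
		pthread_join(t1, NULL);

		/* The outcome the WARN_ON() in the snippet checks for. */
		if (r0 == 0 && r1 == 0)
			hits++;
	}
	return hits;
}
```

Calling count_forbidden(1000) from a small main() and building with
-std=c11 -pthread should be enough to try it.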

Also, wouldn't this problem also arise if your atomics were built using a
spinlock where unlock had release semantics?

Will