Re: [RFC][PATCH 0/5] arch: atomic rework

From: Torvald Riegel
Date: Fri Feb 07 2014 - 13:44:08 EST


On Fri, 2014-02-07 at 18:06 +0100, Peter Zijlstra wrote:
> On Fri, Feb 07, 2014 at 04:55:48PM +0000, Will Deacon wrote:
> > Hi Paul,
> >
> > On Fri, Feb 07, 2014 at 04:50:28PM +0000, Paul E. McKenney wrote:
> > > On Fri, Feb 07, 2014 at 08:44:05AM +0100, Peter Zijlstra wrote:
> > > > On Thu, Feb 06, 2014 at 08:20:51PM -0800, Paul E. McKenney wrote:
> > > > > Hopefully some discussion of out-of-thin-air values as well.
> > > >
> > > > Yes, absolutely shoot store speculation in the head already. Then drive
> > > > a wooden stake through its heart.
> > > >
> > > > C11/C++11 should not be allowed to claim itself a memory model until that
> > > > is sorted.
> > >
> > > There actually is a proposal being put forward, but it might not make ARM
> > > and Power people happy because it involves adding a compare, a branch,
> > > and an ISB/isync after every relaxed load... Me, I agree with you,
> > > much preferring the no-store-speculation approach.
> >
> > Can you elaborate a bit on this please? We don't permit speculative stores
> > in the ARM architecture, so it seems counter-intuitive that GCC needs to
> > emit any additional instructions to prevent that from happening.
> >
> > Stores can, of course, be observed out-of-order but that's a lot more
> > reasonable :)
>
> This is more about the compiler speculating on stores; imagine:
>
> 	if (x)
> 		y = 1;
> 	else
> 		y = 2;
>
> The compiler is allowed to change that into:
>
> 	y = 2;
> 	if (x)
> 		y = 1;

If you write the example like that, this is indeed allowed because it's
all sequential code (and there are no volatiles in there, at least you
didn't show them :). A store to y would happen in either case. You
cannot observe the difference between the two versions in a
data-race-free program.

Are there supposed to be atomic/non-sequential accesses in there? If
so, please update the example.
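
For illustration only, here is a sketch of what an atomic variant of the
example might look like. It assumes x and y are C11 atomics accessed with
relaxed ordering; the function names are made up for this sketch and are
not from the original example.

```c
#include <stdatomic.h>

static atomic_int x;
static atomic_int y;

/* Source form: each call performs exactly one store to y. */
static void store_by_branch(void)
{
	if (atomic_load_explicit(&x, memory_order_relaxed))
		atomic_store_explicit(&y, 1, memory_order_relaxed);
	else
		atomic_store_explicit(&y, 2, memory_order_relaxed);
}

/*
 * Hoisted form: same final value of y when run alone, but it adds a
 * store of 2 on the path where x is non-zero.
 */
static void store_hoisted(void)
{
	atomic_store_explicit(&y, 2, memory_order_relaxed);
	if (atomic_load_explicit(&x, memory_order_relaxed))
		atomic_store_explicit(&y, 1, memory_order_relaxed);
}
```

Run sequentially, both functions leave the same value in y. With a
concurrent reader, though, the hoisted form can expose y == 2 even when x
is non-zero, which the source form never does, so I would not expect a
conforming compiler to turn the first into the second.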

> Which is of course a big problem when you want to rely on the ordering.
>
> There are further problems where things like memset() can write outside
> the specified address range. Examples are memset() using single
> instructions to wipe entire cachelines and then 'restoring' the tail
> bit.

As Joseph said, this would be a bug IMO.
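
To make the concern concrete, here is a rough sketch of the pattern being
described. Everything in it is hypothetical: BLOCK stands in for a
cacheline size, and wide_clear() is an invented helper, not any real
memset() implementation. It assumes len <= BLOCK and that p points to at
least BLOCK bytes.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 8	/* hypothetical stand-in for a cacheline size */

/*
 * Clears the first len bytes of p by wiping a whole BLOCK and then
 * putting back the bytes past len. Assumes len <= BLOCK and that p
 * points to at least BLOCK bytes.
 */
static void wide_clear(unsigned char *p, size_t len)
{
	unsigned char saved[BLOCK];

	memcpy(saved, p, BLOCK);	/* remember the whole block */
	memset(p, 0, BLOCK);		/* also stores to p[len..BLOCK-1] */
	memcpy(p + len, saved + len, BLOCK - len);	/* 'restore' the tail */
}
```

In a single thread the result is the same as memset(p, 0, len). If another
thread writes to p[len..BLOCK-1] between the wide store and the restore,
its update is lost, even though the program never asked for those bytes to
be modified.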

> While valid for single threaded, it's a complete disaster for concurrent
> code.
>
> There's more, but it all boils down to doing stores you don't expect in
> a 'sane' concurrent environment and/or don't respect the control flow.

A few of those got fixed already, because they violated the memory
model's requirements. If you have further examples that are valid code
in the C11/C++11 model, please report them.

