Re: [RFC][PATCH 0/5] arch: atomic rework

From: Paul E. McKenney
Date: Tue Feb 18 2014 - 11:51:34 EST


On Tue, Feb 18, 2014 at 04:56:40PM +0100, Torvald Riegel wrote:
> On Mon, 2014-02-17 at 19:00 -0800, Paul E. McKenney wrote:
> > On Mon, Feb 17, 2014 at 12:18:21PM -0800, Linus Torvalds wrote:
> > > On Mon, Feb 17, 2014 at 11:55 AM, Torvald Riegel <triegel@xxxxxxxxxx> wrote:
> > > >
> > > > Which example do you have in mind here? Haven't we resolved all the
> > > > debated examples, or did I miss any?
> > >
> > > Well, Paul seems to still think that the standard possibly allows
> > > speculative writes or possibly value speculation in ways that break
> > > the hardware-guaranteed orderings.
> >
> > It is not that I know of any specific problems, but rather that I
> > know I haven't looked under all the rocks. Plus my impression from
> > my few years on the committee is that the standard will be pushed to
> > the limit when it comes time to add optimizations.
> >
> > One example that I learned about last week uses the branch-prediction
> > hardware to validate value speculation. And no, I am not at all a fan
> > of value speculation, in case you were curious. However, it is still
> > an educational example.
> >
> > This is where you start:
> >
> > p = gp.load(memory_order_consume); /* AKA rcu_dereference() */
> > do_something(p->a, p->b, p->c);
> > p->d = 1;
>
> I assume that's the source code.

Yep!
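
For concreteness, here is a minimal C++11 sketch of that source pattern
together with the updater side that the dependency ordering is supposed
to pair with. The names struct foo, publish(), reader(), and run_once()
are made up for illustration, and do_something() is given a trivial body
so that the result can be checked; none of this comes from the patch
series. Note also that, as far as I know, current compilers simply
promote memory_order_consume to memory_order_acquire.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical RCU-protected structure; field names follow the example above.
struct foo {
    int a, b, c, d;
};

std::atomic<foo*> gp{nullptr};

// Illustrative stand-in for do_something(): just sums the three fields.
int do_something(int a, int b, int c) { return a + b + c; }

// Updater: initialize the structure first, then publish the pointer.
// The release store orders the initialization before the publication.
void publish(foo* f) {
    f->a = 1;
    f->b = 2;
    f->c = 3;
    f->d = 0;
    gp.store(f, std::memory_order_release);
}

// Reader: the accesses through p carry a dependency from the consume load,
// so they must observe the initialized values.
int reader() {
    foo* p;
    while ((p = gp.load(std::memory_order_consume)) == nullptr)
        std::this_thread::yield();  // wait until something is published
    int r = do_something(p->a, p->b, p->c);
    p->d = 1;
    return r;
}

// Runs one reader thread against one publication and returns what it saw.
int run_once() {
    static foo f;
    gp.store(nullptr, std::memory_order_relaxed);
    int result = 0;
    std::thread t([&result] { result = reader(); });
    publish(&f);
    t.join();
    assert(f.d == 1);  // the reader's store to ->d is visible after join
    return result;
}
```

With the dependency intact, run_once() can only return 6, the sum of the
initialized fields. The concern is a transformation that would let the
reader see anything else.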

> > Then you leverage branch-prediction hardware as follows:
> >
> > p = gp.load(memory_order_consume); /* AKA rcu_dereference() */
> > if (p == GUESS) {
> >     do_something(GUESS->a, GUESS->b, GUESS->c);
> >     GUESS->d = 1;
> > } else {
> >     do_something(p->a, p->b, p->c);
> >     p->d = 1;
> > }
>
> I assume that this is a potential transformation by a compiler.

Again, yep!

> > The CPU's branch-prediction hardware squashes speculation in the case where
> > the guess was wrong, and this prevents the speculative store to ->d from
> > ever being visible. However, the then-clause breaks dependencies, which
> > means that the loads -could- be speculated, so that do_something() gets
> > passed pre-initialization values.
> >
> > Now, I hope and expect that the wording in the standard about dependency
> > ordering prohibits this sort of thing. But I do not yet know for certain.
>
> The transformation would be incorrect. p->a in the source code carries
> a dependency, and as you say, the transformed code wouldn't have that
> dependency any more. So the transformed code would lose ordering
> constraints that it has in the abstract machine, so in the absence of
> other proofs of correctness based on properties not shown in the
> example, the transformed code would not result in the same behavior as
> allowed by the abstract machine.

Glad that you agree! ;-)

> If the transformation were actually written by a programmer, then it
> wouldn't do the same thing as the first example, because mo_consume
> doesn't work through the if statement.

Agreed.
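
To spell that out: if a programmer did write the if statement by hand,
the then-clause would need its ordering from something other than the
dependency chain. One option, sketched below, is an acquire fence in the
then-clause. This is only my illustration of the point, not a
recommendation. struct foo, guess_obj, GUESS, publish(), and
reader_with_guess() are hypothetical names, the NULL check is my
addition, and do_something() again just sums its arguments.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical structure; field names follow the example above.
struct foo {
    int a, b, c, d;
};

std::atomic<foo*> gp{nullptr};
foo guess_obj;                  // hypothetical object the reader guesses
foo* const GUESS = &guess_obj;

// Illustrative stand-in for do_something(): just sums the three fields.
int do_something(int a, int b, int c) { return a + b + c; }

// Updater: initialize, then publish with a release store.
void publish(foo* f, int a, int b, int c) {
    f->a = a;
    f->b = b;
    f->c = c;
    f->d = 0;
    gp.store(f, std::memory_order_release);
}

// Hand-written version of the if statement. Returns -1 if nothing has
// been published yet (an assumption added so the sketch is self-contained).
int reader_with_guess() {
    foo* p = gp.load(std::memory_order_consume);
    if (p == nullptr)
        return -1;
    if (p == GUESS) {
        // GUESS->a does not carry a dependency from the load of gp, so
        // consume ordering does not cover these accesses. The acquire
        // fence supplies the ordering explicitly instead.
        std::atomic_thread_fence(std::memory_order_acquire);
        int r = do_something(GUESS->a, GUESS->b, GUESS->c);
        GUESS->d = 1;
        return r;
    }
    // These accesses go through p and so still carry the dependency.
    int r = do_something(p->a, p->b, p->c);
    p->d = 1;
    return r;
}
```

The fence costs something on weakly ordered hardware, which is of course
why one would prefer to rely on the dependency in the first place.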

> Are there other specific concerns that you have regarding this example?

Nope. Just generalized paranoia. (But just because I am paranoid
doesn't mean that there isn't a bug lurking somewhere in the standard,
the compiler, the kernel, or my own head!)

I will likely have more once I start mapping Linux kernel atomics to the
C11 standard. One more paper past N3934 comes first, though.

Thanx, Paul
