Re: perf events ring buffer memory barrier on powerpc

From: Paul E. McKenney
Date: Fri Nov 01 2013 - 05:28:27 EST


On Thu, Oct 31, 2013 at 04:19:55PM +0100, Peter Zijlstra wrote:
> On Thu, Oct 31, 2013 at 08:07:56AM -0700, Paul E. McKenney wrote:
> > On Thu, Oct 31, 2013 at 10:04:57AM +0100, Peter Zijlstra wrote:
> > > On Wed, Oct 30, 2013 at 09:32:58PM -0700, Paul E. McKenney wrote:
> > > > Before C/C++11, the closest thing to such a prohibition is use of
> > > > volatile, for example, ACCESS_ONCE(). Even in C/C++11, you have to
> > > > use atomics to get anything resembling this prohibition.
> > > >
> > > > If you just use normal variables, the compiler is within its rights
> > > > to transform something like the following:
> > > >
> > > > 	if (a)
> > > > 		b = 1;
> > > > 	else
> > > > 		b = 42;
> > > >
> > > > Into:
> > > >
> > > > 	b = 42;
> > > > 	if (a)
> > > > 		b = 1;
> > > >
> > > > Many other similar transformations are permitted. Some are used to allow
> > > > vector instructions to be used -- the compiler can do a write with an
> > > > overly wide vector instruction, then clean up the clobbered variables
> > > > later, if it wishes. Again, if the variables are not marked volatile,
> > > > or, in C/C++11, atomic.
> > >
> > > While I've heard you tell this story before, my mind keeps boggling how
> > > we've been able to use shared memory at all, all these years.
> > >
> > > It seems to me stuff should have broken left, right and center if
> > > compilers were really aggressive about this.
> >
> > Sometimes having stupid compilers is a good thing. But they really are
> > getting more aggressive.
>
> But surely we cannot go mark all data structures lodged in shared memory
> as volatile, that's insane.
>
> I'm sure you're quite worried about this as well. Suppose we have:
>
> struct foo {
> 	unsigned long value;
> 	void *ptr;
> 	unsigned long value1;
> };
>
> And our ptr member is RCU managed. Then while the assignment using
> rcu_assign_pointer() will use the volatile cast, what stops the compiler
> from wrecking ptr while writing either of the value* members and
> 'fixing' her up after?

Nothing at all!

We can reduce the probability by putting the pointer at one end or the
other, so that if the compiler uses (say) vector instructions to aggregate
individual assignments to the other fields, it will be less likely to hit
"ptr". But yes, this is ugly and it would be really hard to get all
this right, and would often conflict with cache-locality needs.
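
As a minimal sketch of that layout (the struct name foo_ptr_last is made up
for illustration, and nothing here is guaranteed to stop the compiler), the
pointer moves to the end so that the two plain fields sit next to each other:

```c
#include <stddef.h>

/* Hypothetical reordering of struct foo: plain fields first, pointer last. */
struct foo_ptr_last {
	unsigned long value;
	unsigned long value1;
	void *ptr;	/* RCU-managed; no longer between the plain fields */
};

/*
 * C lays out members in declaration order, so ptr follows both value
 * fields. This only makes a wide store spanning value and value1 less
 * likely to touch ptr; it does not forbid the compiler from doing so.
 */
_Static_assert(offsetof(struct foo_ptr_last, ptr) >
	       offsetof(struct foo_ptr_last, value1),
	       "ptr must come after the plain fields");
```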

> This is a completely untenable position.

Indeed it is!

C/C++ never was intended to be used for parallel programming, and this is
but one of the problems that can arise when we nevertheless use it for
parallel programming. As compilers get smarter (for some definition of
"smarter") and as more systems have special-purpose hardware (such as
vector units) that is visible to the compiler, we can expect more of
this kind of trouble.

This was one of many reasons that I decided to help with the C/C++11
effort, whatever anyone might think about the results.

> How do the C/C++ people propose to deal with this?

By marking "ptr" as atomic, thus telling the compiler not to mess with it.
And thus requiring that all accesses to it be decorated, which in the
case of RCU could be buried in the RCU accessors.
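
A rough sketch of what that might look like with C11 <stdatomic.h>. The
helpers foo_publish() and foo_read() are hypothetical stand-ins for the RCU
accessors, not kernel APIs, and release/acquire ordering is assumed here as
a conservative choice:

```c
#include <stdatomic.h>
#include <stddef.h>

struct foo {
	unsigned long value;
	_Atomic(void *) ptr;	/* atomic: every access must be decorated */
	unsigned long value1;
};

/* Hypothetical update-side helper: publish a new pointer value. */
static inline void foo_publish(struct foo *f, void *p)
{
	/* Release ordering: earlier initialization of *p is visible first. */
	atomic_store_explicit(&f->ptr, p, memory_order_release);
}

/* Hypothetical read-side helper: fetch the currently published pointer. */
static inline void *foo_read(struct foo *f)
{
	/* Acquire ordering: later dereferences see the published contents. */
	return atomic_load_explicit(&f->ptr, memory_order_acquire);
}
```

Plain stores to value and value1 stay undecorated; only ptr goes through
the helpers.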

Thanx, Paul
