Re: [PATCH] documentation: Fix two-CPU control-dependency example
From: Paul E. McKenney
Date: Thu Jul 20 2017 - 17:42:43 EST
On Fri, Jul 21, 2017 at 06:12:56AM +0900, Akira Yokosawa wrote:
> On 2017/07/20 09:11:52AM -0700, Paul E. McKenney wrote:
> > On Thu, Jul 20, 2017 at 09:55:31PM +0900, Akira Yokosawa wrote:
> >> On 2017/07/20 14:47, Paul E. McKenney wrote:
> >>> On Thu, Jul 20, 2017 at 09:31:41AM +0800, Boqun Feng wrote:
> >>>> On Wed, Jul 19, 2017 at 02:56:02PM -0700, Paul E. McKenney wrote:
> >>>>> On Thu, Jul 20, 2017 at 06:33:26AM +0900, Akira Yokosawa wrote:
> >>>>>> On 2017/07/20 2:43, Paul E. McKenney wrote:
> >>>>>>> On Mon, Jul 17, 2017 at 05:24:42PM +0900, Akira Yokosawa wrote:
> >>>>>>>> >From b798b9b631e237d285aa8699da00bfb8ced33bea Mon Sep 17 00:00:00 2001
> >>>>>>>> From: Akira Yokosawa <akiyks@xxxxxxxxx>
> >>>>>>>> Date: Mon, 17 Jul 2017 16:25:33 +0900
> >>>>>>>> Subject: [PATCH] documentation: Fix two-CPU control-dependency example
> >>>>>>>>
> >>>>>>>> In commit 5646f7acc95f ("memory-barriers: Fix control-ordering
> >>>>>>>> no-transitivity example"), the operator in "if" statement of
> >>>>>>>> the two-CPU example was modified from ">=" to ">".
> >>>>>>>> Now the example misses the point because there is no party
> >>>>>>>> who will modify "x" nor "y". So each CPU performs only the
> >>>>>>>> READ_ONCE().
> >>>>>>>>
> >>>>>>>> The point of this example is to use control dependency for ordering,
> >>>>>>>> and the WRITE_ONCE() should always be executed.
> >>>>>>>>
> >>>>>>>> So it was correct prior to the above mentioned commit. Partial
> >>>>>>>> revert of the commit (with context adjustments regarding other
> >>>>>>>> changes thereafter) restores the point.
> >>>>>>>>
> >>>>>>>> Note that the three-CPU example demonstrating the lack of
> >>>>>>>> transitivity stands regardless of this partial revert.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>
> >>>>>>>
> >>>>>>> Hello, Akira,
> >>>>>>>
> >>>>>>> You are quite right that many compilers would generate straightforward
> >>>>>>> code for the code fragment below, and in that case, the assertion could
> >>>>>>> never trigger due to either TSO or control dependencies, depending on
> >>>>>>> the architecture this was running on.
> >>>>>>>
> >>>>>>> However, if the compiler was too smart for our good, it could figure
> >>>>>>> out that "x" and "y" only take on the values zero and one, so that
> >>>>>>> the "if" would always be taken. At that point, the compiler could
> >>>>>>> simply ignore the "if" with the result shown below.
> >>>>>>>
> >>>>>>>> ---
> >>>>>>>> Documentation/memory-barriers.txt | 2 +-
> >>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> >>>>>>>> index c4ddfcd..c1ebe99 100644
> >>>>>>>> --- a/Documentation/memory-barriers.txt
> >>>>>>>> +++ b/Documentation/memory-barriers.txt
> >>>>>>>> @@ -851,7 +851,7 @@ demonstrated by two related examples, with the initial values of
> >>>>>>>> CPU 0 CPU 1
> >>>>>>>> ======================= =======================
> >>>>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
> >>>>>>>> - if (r1 > 0) if (r2 > 0)
> >>>>>>>> + if (r1 >= 0) if (r2 >= 0)
> >>>>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
> >>>>>>>>
> >>>>>>>> assert(!(r1 == 1 && r2 == 1));
> >>>>>>>
> >>>>>>> Original program:
> >>>>>>>
> >>>>>>> CPU 0 CPU 1
> >>>>>>> ======================= =======================
> >>>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
> >>>>>>> if (r1 >= 0) if (r2 >= 0)
> >>>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
> >>>>>>>
> >>>>>>> assert(!(r1 == 1 && r2 == 1));
> >>>>>>>
> >>>>>>> Enthusiastically optimized program:
> >>>>>>>
> >>>>>>> CPU 0 CPU 1
> >>>>>>> ======================= =======================
> >>>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
> >>>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
> >>>>>>>
> >>>>>>> assert(!(r1 == 1 && r2 == 1));
> >>>>>>>
> >>>>>>> This optimized version could trigger the assertion.
> >>>>>>>
> >>>>>>> So we do need to stick with the ">" comparison.
> >>>>>>
> >>>>>> Well but,
> >>>>>>
> >>>>>> Current example:
> >>>>>>
> >>>>>> CPU 0 CPU 1
> >>>>>> ======================= =======================
> >>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
> >>>>>> if (r1 > 0) if (r2 > 0)
> >>>>>> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
> >>>>>>
> >>>>>> assert(!(r1 == 1 && r2 == 1));
> >>>>>>
> >>>>>> Such a clever compiler might be able to prove that "x" and "y"
> >>>>>> are never modified and end up in the following:
> >>>>>>
> >>>>
> >>>> Hi Akira,
> >>>>
> >>>> I wouldn't call that compiler a clever one, it's a broken one ;-)
> >>>>
> >>>> So here is the thing: READ_ONCE() is a *volatile* load, which means the
> >>>> compiler has to generate code that actually does a load, so the values
> >>>> of r1 and r2 depend on the loads, therefore, a sane compiler will not
> >>>> optimize the if()s out because the volatile semantics of READ_ONCE().
> >>>> Otherwise, I think we have a lot more to worry about than this case.
> >>>>
> >>>>>> CPU 0 CPU 1
> >>>>>> ======================= =======================
> >>>>>> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
> >>>>>>
> >>>>>> assert(!(r1 == 1 && r2 == 1));
> >>>>>>
> >>>>>> This means it is impossible to describe this example in C,
> >>>>>> doesn't it?
> >>>>>
> >>>>> That is a matter of some debate, but it has gotten increasingly more
> >>>>> difficult to get C to do one's bidding over the decades.
> >>>>>
> >>>>>> What am I missing here?
> >>>>>
> >>>>> The compiler has to work harder in your example case, so it is probably
> >>>>> just a matter of time. In the original with ">=", all the compiler needed
> >>>>> to do was look at all the assignments and the initial value. In the
> >>>>> original, it also had to do reasoning based on control dependencies
> >>>>> (which are not yet part of the C standard).
> >>>>>
> >>>>> But yes, the degree to which a compiler can optimize atomics is a matter
> >>>>> of some debate. Here is a proposal to let the developer choose:
> >>>>>
> >>>>
> >>>> Hi Paul,
> >>>>
> >>>> I know the compiler could optimize atomics in very interesting ways, but
> >>>> this case is about volatile, so I guess our case is still fine? ;-)
> >>>
> >>> Hello, Boqun,
> >>>
> >>> When I asked that question, the answer I got was "the compiler must
> >>> emit the load instruction, but is under no obligation to actually use the
> >>> value loaded".
> >>
> >> So, it sounds like the following description found in memory-barriers.txt
> >> just above the example of lack of transitivity:
> >>
> >>> However, stores are not speculated. This means that ordering -is- provided
> >>> for load-store control dependencies, as in the following example:
> >>>
> >>> q = READ_ONCE(a);
> >>> if (q) {
> >>> WRITE_ONCE(b, 1);
> >>> }
> >>>
> >>> Control dependencies pair normally with other types of barriers.
> >>> That said, please note that neither READ_ONCE() nor WRITE_ONCE()
> >>> are optional! Without the READ_ONCE(), the compiler might combine the
> >>> load from 'a' with other loads from 'a'. Without the WRITE_ONCE(),
> >>> the compiler might combine the store to 'b' with other stores to 'b'.
> >>> Either can result in highly counterintuitive effects on ordering.
> >>>
> >>> Worse yet, if the compiler is able to prove (say) that the value of
> >>> variable 'a' is always non-zero, it would be well within its rights
> >>> to optimize the original example by eliminating the "if" statement
> >>> as follows:
> >>>
> >>> q = a;
> >>> b = 1; /* BUG: Compiler and CPU can both reorder!!! */
> >>>
> >>> So don't leave out the READ_ONCE().
> >>
> >> does not stand if the answer needs to be taken seriously.
> >> The READ_ONCE() is not good enough to prevent the "if" statement
> >> from being eliminated.
> >>
> >> Hmm... If we do need to care about such an extreme optimization,
> >> we should not rely on any control dependency in C source code
> >> at least until the compiler gets educated about control dependency.
> >>
> >> Is there some reasonable compromise?
> >
> > Right now, what we have are the volatile loads such as from READ_ONCE()
> > and WRITE_ONCE(). My defense of them has been that they need to properly
> > handle MMIO. But not all compiler writers agree with me, so we should
> > program at least a bit defensively. Though we should probably also
> > be writing tests to verify that the compiler is doing the right thing,
> > and those litmus tests should probably include your ">=" litmus test.
> > But we should not encourage developers to rely on it quite yet.
> >
> > My current position is that compiler writers are currently unlikely to
> > make their compilers do deep reasoning around the memory model, and that
> > they are currently unlikely to do cross-translation-unit assigned-value
> > analysis (though LTO might be changing that). But given a static atomic
> > that is visible only within a translation unit, I would expect them
> > to do value-range analysis. Hence my reluctance to present the ">="
> > variant as a pattern to follow.
>
> So if we declare "x" and "y" are not static (not file local),
> can the ">=" variant be exempted from the possible optimization?
>
> For example:
>
> extern int x, y;
>
> CPU 0 CPU 1
> ======================= =======================
> r1 = READ_ONCE(x); r2 = READ_ONCE(y);
> if (r1 >= 0) if (r2 >= 0)
> WRITE_ONCE(y, 1); WRITE_ONCE(x, 1);
>
> assert(!(r1 == 1 && r2 == 1));
>
> This looks safe to me.
For the compilers I know about at the present time, yes.
The underlying problem is that the standard says almost nothing about what
volatile does. I usually argue that it was intended to be used for MMIO,
so any optimization that would prevent a device driver from working must
be prohibited on volatiles. A lot of people really don't like volatile,
and would like to eliminate it, and these people also aren't very happy
about any attempt to better define volatile.
Keeps things entertaining. ;-)
Thanx, Paul
> Thoughts?
>
> Thanks, Akira
>
> >
> > Thanx, Paul
> >
> >
> >> Thanks, Akira
> >>
> >>>
> >>> I don't happen to like that answer, by the way. ;-)
> >>>
> >>> Thanx, Paul
> >>>
> >>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0062r1.html
> >>>>>
> >>>>
> >>>> Great material to wake up mind in the morning! Thanks.
> >>>>
> >>>> Regards,
> >>>> Boqun
> >>>>
> >>>>> What are your thoughts on this?
> >>>>>
> >>>>> Thanx, Paul
> >>>>>
> >>>>>> Thanks, Akira
> >>>>>>
> >>>>>>> That said, I very much welcome critical reviews of memory-barriers.txt,
> >>>>>>> so please do feel free to continue doing that!
> >>>>>>>
> >>>>>>> Thanx, Paul
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
> >
>