Re: [PATCH] documentation: Fix two-CPU control-dependency example

From: Akira Yokosawa
Date: Sun Jul 23 2017 - 20:06:04 EST


On 2017/07/23 23:39:36 +0800, Boqun Feng wrote:
> On Sat, Jul 22, 2017 at 09:43:00PM -0700, Paul E. McKenney wrote:
> [...]
>>> Your priority seemed to be in reducing the chance of the "if" statement
>>> to be optimized away. So I suggested to use "extern" as a compromise.
>>
>
> Hi Akira,
>

Hi Boqun,

> The problem is that, such a compromise doesn't help *developers* write
> good concurrent code. The document should serve as a reference book for
> the developers, and with the compromise you suggest, the developers will
> possibly add "extern" to their shared variables. This is not only
> unrealistic but also wrong, because "extern" means external for
> translation units(compiling units), not external for execution
> units(CPUs).

Yes, I suggested it regarding the situation when the tiny litmus test
is compiled in a translation unit. Also it might not be effective once
link time optimization becomes "smart" enough.

And I agree it was not appropriate for memory-barriers.txt.

>
> And as I said, the proper semantics of READ_ONCE() should work well
> without using "extern", if we find a 'volatile' load doesn't work, we
> can find another way (writing in asm or use asm volatile("" : "+m"(var));
> to indicate @var changed). And the compromise just changes the
> semantics... To me, it's not worth changing the semantics because the
> implementation might be broken in the future ;-)

I agree.
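
For reference, here is a minimal userspace sketch of the two options
mentioned above: a volatile load, and an empty asm with a "+m"
constraint that tells the compiler the variable may have changed.
It assumes GCC or Clang (for __typeof__ and extended asm). The macro
names READ_ONCE_SKETCH(), WRITE_ONCE_SKETCH() and FORGET_VALUE() are
hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for READ_ONCE()/WRITE_ONCE(): volatile accesses. */
#define READ_ONCE_SKETCH(v)       (*(const volatile __typeof__(v) *)&(v))
#define WRITE_ONCE_SKETCH(v, val) (*(volatile __typeof__(v) *)&(v) = (val))

/*
 * Hypothetical alternative: an empty asm that claims to read and write
 * @var in memory, so the compiler cannot assume it still knows the value.
 */
#define FORGET_VALUE(var) __asm__ __volatile__("" : "+m"(var))

int a;
int x;

/* Option 1: the branch depends on the result of a volatile load. */
void some_func(void)
{
	int t = READ_ONCE_SKETCH(a);

	if (!t)
		WRITE_ONCE_SKETCH(x, 1);
}

/* Option 2: a plain load, preceded by the "+m" asm on the variable. */
void some_func_asm(void)
{
	FORGET_VALUE(a);
	if (!a)
		WRITE_ONCE_SKETCH(x, 1);
}

/* Single-threaded sanity check: the store happens only when a == 0. */
void check_both(void)
{
	a = 0; x = 0; some_func();
	assert(x == 1);
	a = 1; x = 0; some_func();
	assert(x == 0);
	a = 0; x = 0; some_func_asm();
	assert(x == 1);
	a = 1; x = 0; some_func_asm();
	assert(x == 0);
}
```

This only checks single-threaded behavior. It says nothing about
whether a given compiler preserves the conditional branch, which is
the actual point under discussion.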

>
>
>> If the various tools accept the "extern", this might not be a bad thing
>> to do.
>>
>> But what this really means is that I need to take another tilt at
>> the "volatile" windmill in the committee.
>>
>>> Another way would be to express the ">=" version in a pseudo-asm form.
>>>
>>>     CPU 0                       CPU 1
>>>     =======================     =======================
>>>     r1 = LOAD x                 r2 = LOAD y
>>>     if (r1 >= 0)                if (r2 >= 0)
>>>       STORE y = 1                 STORE x = 1
>>>
>>> assert(!(r1 == 1 && r2 == 1));
>>>
>>> This should eliminate any concern of compiler optimization.
>>> In this final part of CONTROL DEPENDENCIES section, separating the
>>> problem of optimization and transitivity would clarify the point
>>> (at least for me).
>>
>> The problem is that people really do use C-language control dependencies
>> in the Linux kernel, so we need to describe them. Maybe someday it
>> will be necessary to convert them to asm, but I am hoping that we can
>> avoid that.
>>
>>> Thoughts?
>>
>> My hope is that the memory model can help here, but that will in any
>> case take time.
>
> Hi Paul,
>
> I add some comments for READ_ONCE() to emphasize compilers should honor
> the return value, in the future, we may need a separate document for the
> use/definition of volatile in kernel, but I think the comment of
> READ_ONCE() is good enough now?
>
> Regards,
> Boqun
>
> ----------------->8
> Subject: [PATCH] kernel: Emphasize the return value of READ_ONCE() is honored
>
> READ_ONCE() is used around in kernel to provide a control dependency,
> and to make the control dependency valid, we must 1) make the load of
> READ_ONCE() actually happen and 2) make sure compilers take the return
> value of READ_ONCE() seriously. 1) is already done and commented,
> and in current implementation, 2) is also considered done in the
> same way as 1): a 'volatile' load.
>
> Whereas, Akira Yokosawa recently reported a problem that would be
> triggered if 2) is not achieved.

To clarify the timeline, it was Paul who pointed out it would become
easier for compilers to optimize away the "if" statements in response
to my suggestion of partial revert (">" -> ">=").

> Moreover, according to Paul McKenney,
> using volatile might not actually give us what we want for 2) depending
> on compiler writers' definition of 'volatile'. Therefore it's necessary
> to emphasize 2) as a part of the semantics of READ_ONCE(), this not only
> fits the conceptual semantics we have been using, but also makes the
> implementation requirement more accurate.
>
> In the future, we can either make compiler writers accept our use of
> 'volatile', or(if that fails) find another way to provide this
> guarantee.
>
> Cc: Akira Yokosawa <akiyks@xxxxxxxxx>
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> ---
> include/linux/compiler.h | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 219f82f3ec1a..8094f594427c 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -305,6 +305,31 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
> * mutilate accesses that either do not require ordering or that interact
> * with an explicit memory barrier or atomic instruction that provides the
> * required ordering.
> + *
> + * The return value of READ_ONCE() should be honored by compilers, IOW,
> + * compilers must treat the return value of READ_ONCE() as an unknown value at
> + * compile time, i.e. no optimization should be done based on the value of a
> + * READ_ONCE(). For example, the following code snippet:
> + *
> + *     int a = 0;
> + *     int x = 0;
> + *
> + *     void some_func() {
> + *             int t = READ_ONCE(a);
> + *             if (!t)
> + *                     WRITE_ONCE(x, 1);
> + *     }
> + *
> + * , should never be optimized as:
> + *
> + *     void some_func() {
> + *             WRITE_ONCE(x, 1);
> + *     }
READ_ONCE() should still be honored. So maybe the following?

+ * , should never be optimized as:
+ *
+ *     void some_func() {
+ *             int t = READ_ONCE(a);
+ *             WRITE_ONCE(x, 1);
+ *     }

Thanks, Akira

> + *
> + * because the compiler is 'smart' enough to think the value of 'a' is never
> + * changed.
> + *
> + * We provide this guarantee by making READ_ONCE() a *volatile* load.
> */
>
> #define __READ_ONCE(x, check) \
>