Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()
From: Akira Yokosawa
Date: Thu Jun 10 2021 - 20:59:57 EST
On Thu, 10 Jun 2021 17:48:13 -0700, Paul E. McKenney wrote:
> On Fri, Jun 11, 2021 at 09:28:10AM +0900, Akira Yokosawa wrote:
>> On Thu, 10 Jun 2021 09:57:10 -0700, Paul E. McKenney wrote:
>>> On Thu, Jun 10, 2021 at 05:50:29PM +0200, Frederic Weisbecker wrote:
>>>> Add some missing critical pieces of explanation to understand the need
>>>> for full memory barriers throughout the whole grace period state machine,
>>>> thanks to Paul's explanations.
>>>>
>>>> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
>>>> Cc: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
>>>> Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
>>>> Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
>>>> Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
>>>
>>> Nice!!! And not bad wording either, though I still could not resist the
>>> urge to wordsmith further. Plus I combined your two examples, in order to
>>> provide a trivial example use of the polling interfaces, if nothing else.
>>>
>>> Please let me know if I messed anything up.
>>
>> Hi Paul,
>>
>> See minor tweaks below to satisfy sphinx.
>>
>>>
>>> Thanx, Paul
>>>
>>> ------------------------------------------------------------------------
>>>
>>> commit f21b8fbdf9a59553da825265e92cedb639b4ba3c
>>> Author: Frederic Weisbecker <frederic@xxxxxxxxxx>
>>> Date: Thu Jun 10 17:50:29 2021 +0200
>>>
>>> rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()
>>>
>>> Add some missing critical pieces of explanation to understand the need
>>> for full memory barriers throughout the whole grace period state machine,
>>> thanks to Paul's explanations.
>>>
>>> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
>>> Cc: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
>>> Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
>>> Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
>>> Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
>>> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
>>>
>>> diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
>>> index 11cdab037bff..3cd5cb4d86e5 100644
>>> --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
>>> +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
>>> @@ -112,6 +112,35 @@ on PowerPC.
>>> The ``smp_mb__after_unlock_lock()`` invocations prevent this
>>> ``WARN_ON()`` from triggering.
>>>
>>> ++-----------------------------------------------------------------------+
>>> +| **Quick Quiz**: |
>>> ++-----------------------------------------------------------------------+
>>> +| But the whole chain of rcu_node-structure locking guarantees that |
>>> +| readers see all pre-grace-period accesses from the updater and |
>>> +| also guarantees that the updater to see all post-grace-period |
>>> +| accesses from the readers. So why do we need all of those calls |
>>> +| to smp_mb__after_unlock_lock()? |
>>> ++-----------------------------------------------------------------------+
>>> +| **Answer**: |
>>> ++-----------------------------------------------------------------------+
>>> +| Because we must provide ordering for RCU's polling grace-period |
>>> +| primitives, for example, get_state_synchronize_rcu() and |
>>> +| poll_state_synchronize_rcu(). For example: |
>>> +| |
>>> +| CPU 0 CPU 1 |
>>> +| ---- ---- |
>>> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
>>> +| g = get_state_synchronize_rcu() smp_mb() |
>>> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
>>> +| continue; |
>>
>> This indent causes warnings from sphinx:
>>
>> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:135: WARNING: Unexpected indentation.
>> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:137: WARNING: Block quote ends without a blank line; unexpected unindent
>>
>>> +| r0 = READ_ONCE(Y) |
>>> +| |
>>> +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
>>> +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
>>> +| or offline) and thus won't interact directly with the RCU core |
>>> +| processing at all. |
>>> ++-----------------------------------------------------------------------+
>>> +
>>> This approach must be extended to include idle CPUs, which need
>>> RCU's grace-period memory ordering guarantee to extend to any
>>> RCU read-side critical sections preceding and following the current
>>
>> The code block in the answer can be fixed as follows:
>>
>> ++-----------------------------------------------------------------------+
>> +| **Answer**: |
>> ++-----------------------------------------------------------------------+
>> +| Because we must provide ordering for RCU's polling grace-period |
>> +| primitives, for example, get_state_synchronize_rcu() and |
>> +| poll_state_synchronize_rcu(). For example:: |
>> +| |
>> +| CPU 0 CPU 1 |
>> +| ---- ---- |
>> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
>> +| g = get_state_synchronize_rcu() smp_mb() |
>> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
>> +| continue; |
>> +| r0 = READ_ONCE(Y) |
>> +| |
>> +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
>> +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
>> +| or offline) and thus won't interact directly with the RCU core |
>> +| processing at all. |
>> ++-----------------------------------------------------------------------+
>>
>> Hint: Use of "::" and indented code block.
>
> Thank you!
>
> As in with the following patch to be merged into Frederic's original,
> with attribution?
Sounds good to me!
Thanks, Akira
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> index 3cd5cb4d86e5..bc884ebf88bb 100644
> --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> @@ -125,15 +125,15 @@ The ``smp_mb__after_unlock_lock()`` invocations prevent this
> +-----------------------------------------------------------------------+
> | Because we must provide ordering for RCU's polling grace-period |
> | primitives, for example, get_state_synchronize_rcu() and |
> -| poll_state_synchronize_rcu(). For example: |
> +| poll_state_synchronize_rcu(). For example:: |
> | |
> -| CPU 0 CPU 1 |
> -| ---- ---- |
> -| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> -| g = get_state_synchronize_rcu() smp_mb() |
> -| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> -| continue; |
> -| r0 = READ_ONCE(Y) |
> +| CPU 0 CPU 1 |
> +| ---- ---- |
> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> +| g = get_state_synchronize_rcu() smp_mb() |
> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> +| continue; |
> +| r0 = READ_ONCE(Y) |
> | |
> | RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
> | happen, even if CPU 1 is in an RCU extended quiescent state (idle |
>