Re: [PATCH v4 5/5] docs: Update RCU's hotplug requirements with a bit about design

From: Randy Dunlap
Date: Fri Aug 07 2020 - 22:10:55 EST


Hi--

On 8/7/20 10:07 AM, Joel Fernandes (Google) wrote:
> RCU's hotplug design will help understand the requirements an RCU
> implementation needs to fulfill, such as deadlock avoidance.
>
> The rcu_barrier() section of the "Hotplug CPU" section already talks
> about deadlocks, however the description of what else can deadlock other
> than rcu_barrier is rather incomplete.
>
> This commit therefore continues the section by describing how RCU's
> design handles CPU hotplug in a deadlock-free way.
>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> ---
> .../RCU/Design/Requirements/Requirements.rst | 22 +++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
> index 16c64a2eff93..0a4148b9f743 100644
> --- a/Documentation/RCU/Design/Requirements/Requirements.rst
> +++ b/Documentation/RCU/Design/Requirements/Requirements.rst
> @@ -1940,6 +1940,28 @@ deadlock. Furthermore, ``rcu_barrier()`` blocks CPU-hotplug operations
> during its execution, which results in another type of deadlock when
> invoked from a CPU-hotplug notifier.
>
> +Also, RCU's implementation avoids serious deadlocks which could occur due to
> +interaction between hotplug, timers and grace period processing. It does so by
> +maintaining its own bookkeeping of every CPU's hotplug state, independent of
> +the various CPU masks and by reporting quiescent states at explicit points. It
> +may come across as a surprise, but the force quiescent state loop (FQS) does
> +not report quiescent states for offline CPUs and is not required to.
> +
> +For an offline CPU, the quiescent state will be reported in either of:
> +1. During CPU offlining, using RCU's hotplug notifier (``rcu_report_dead()``).

Note: this one uses "()" after the function name, which is good.

> +2. During grace period initialization (``rcu_gp_init``) if it detected a race

Add "()" for consistency & readability: rcu_gp_init()

> + with CPU offlining, or a race with a task unblocking on a node which
> + previously had all of its CPUs offlined.
> +
> +The CPU onlining path (``rcu_cpu_starting``) does not need to a report

ditto: rcu_cpu_starting()

> +quiescent state for an offline CPU in fact it would trigger a warning if a

Missing something; maybe like so:

for an offline CPU; in fact

> +quiescent state was not already reported for that CPU.
> +
> +During the checking/modification of RCU's hotplug bookkeeping, the
> +corresponding CPU's leaf node lock is held. This avoids race conditions between
> +RCU's hotplug notifier hooks, grace period initialization code and the FQS loop
> +which can concurrently refer to or modify the bookkeeping.
> +
> Scheduler and RCU
> ~~~~~~~~~~~~~~~~~
>
>
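
For readers who want something concrete for the last paragraph of the
patch (bookkeeping checked and modified only under the leaf node lock),
here is a rough user-space sketch. It is a toy model and not kernel code:
struct toy_leaf, the toy_*() helpers and TOY_CPUS_PER_LEAF are
hypothetical names made up for illustration, and a pthread mutex stands
in for the leaf rcu_node lock. It also simplifies the race handling
described above: grace period initialization here just never waits on a
CPU that the bookkeeping already shows as offline.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define TOY_CPUS_PER_LEAF 4	/* hypothetical fan-out, for illustration only */

/* Hypothetical stand-in for a leaf node; NOT the kernel's struct rcu_node. */
struct toy_leaf {
	pthread_mutex_t lock;		/* guards both masks below */
	unsigned long online_mask;	/* the model's own view of online CPUs */
	unsigned long qs_pending;	/* CPUs the current grace period waits on */
};

static unsigned long toy_bit(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_CPUS_PER_LEAF);
	return 1UL << cpu;
}

static void toy_leaf_init(struct toy_leaf *leaf)
{
	pthread_mutex_init(&leaf->lock, NULL);
	leaf->online_mask = 0;
	leaf->qs_pending = 0;
}

/*
 * Onlining path: mark the CPU online. Returns false if a quiescent state
 * was still pending for it, which is where the real code would warn.
 */
static bool toy_cpu_starting(struct toy_leaf *leaf, int cpu)
{
	unsigned long bit = toy_bit(cpu);
	bool ok;

	pthread_mutex_lock(&leaf->lock);
	ok = !(leaf->qs_pending & bit);
	leaf->online_mask |= bit;
	pthread_mutex_unlock(&leaf->lock);
	return ok;
}

/* Offlining path: report the outgoing CPU's quiescent state, mark it offline. */
static void toy_report_dead(struct toy_leaf *leaf, int cpu)
{
	unsigned long bit = toy_bit(cpu);

	pthread_mutex_lock(&leaf->lock);
	leaf->qs_pending &= ~bit;
	leaf->online_mask &= ~bit;
	pthread_mutex_unlock(&leaf->lock);
}

/* Grace period init: wait only on CPUs the bookkeeping shows as online. */
static unsigned long toy_gp_init(struct toy_leaf *leaf)
{
	unsigned long snap;

	pthread_mutex_lock(&leaf->lock);
	leaf->qs_pending = leaf->online_mask;
	snap = leaf->qs_pending;
	pthread_mutex_unlock(&leaf->lock);
	return snap;
}

/* An online CPU reports its own quiescent state. */
static void toy_report_qs(struct toy_leaf *leaf, int cpu)
{
	unsigned long bit = toy_bit(cpu);

	pthread_mutex_lock(&leaf->lock);
	leaf->qs_pending &= ~bit;
	pthread_mutex_unlock(&leaf->lock);
}
```

Because every reader and writer of the two masks takes the same lock, no
path in this model ever sees a CPU that is marked offline but still owes
a quiescent state. No loop over offline CPUs is needed to clear them,
which is the spirit of the FQS remark in the patch.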

cheers.
--
~Randy