Re: [PATCH v4 5/5] docs: Update RCU's hotplug requirements with a bit about design
From: Joel Fernandes
Date: Mon Aug 10 2020 - 13:41:55 EST
On Fri, Aug 07, 2020 at 07:10:38PM -0700, Randy Dunlap wrote:
> Hi--
>
> On 8/7/20 10:07 AM, Joel Fernandes (Google) wrote:
> > RCU's hotplug design will help understand the requirements an RCU
> > implementation needs to fulfill, such as deadlock avoidance.
> >
> > The rcu_barrier() section of the "Hotplug CPU" section already talks
> > about deadlocks, however the description of what else can deadlock other
> > than rcu_barrier is rather incomplete.
> >
> > This commit therefore continues the section by describing how RCU's
> > design handles CPU hotplug in a deadlock-free way.
> >
> > Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
Thank you Randy for pointing these out. I will fix the nits in the next revision.
Regards,
- Joel
> > .../RCU/Design/Requirements/Requirements.rst | 22 +++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
> > index 16c64a2eff93..0a4148b9f743 100644
> > --- a/Documentation/RCU/Design/Requirements/Requirements.rst
> > +++ b/Documentation/RCU/Design/Requirements/Requirements.rst
> > @@ -1940,6 +1940,28 @@ deadlock. Furthermore, ``rcu_barrier()`` blocks CPU-hotplug operations
> > during its execution, which results in another type of deadlock when
> > invoked from a CPU-hotplug notifier.
> >
> > +Also, RCU's implementation avoids serious deadlocks which could occur due to
> > +interaction between hotplug, timers and grace period processing. It does so by
> > +maintaining its own bookkeeping of every CPU's hotplug state, independent of
> > +the various CPU masks and by reporting quiescent states at explicit points. It
> > +may come across as a surprise, but the force quiescent state loop (FQS) does
> > +not report quiescent states for offline CPUs and is not required to.
> > +
> > +For an offline CPU, the quiescent state will be reported in either of:
> > +1. During CPU offlining, using RCU's hotplug notifier (``rcu_report_dead()``).
>
> note, uses (), which is good: ()
>
> > +2. During grace period initialization (``rcu_gp_init``) if it detected a race
>
> add for consistency & readability: rcu_gp_init()
>
> > + with CPU offlining, or a race with a task unblocking on a node which
> > + previously had all of its CPUs offlined.
> > +
> > +The CPU onlining path (``rcu_cpu_starting``) does not need to report a
>
> ditto: rcu_cpu_starting()
>
> > +quiescent state for an offline CPU in fact it would trigger a warning if a
>
> Missing something; maybe like so:
>
> for an offline CPU; in fact
>
> > +quiescent state was not already reported for that CPU.
> > +
> > +During the checking/modification of RCU's hotplug bookkeeping, the
> > +corresponding CPU's leaf node lock is held. This avoids race conditions between
> > +RCU's hotplug notifier hooks, grace period initialization code and the FQS loop
> > +which can concurrently refer to or modify the bookkeeping.
> > +
> > Scheduler and RCU
> > ~~~~~~~~~~~~~~~~~
> >
> >
>
> cheers.
> --
> ~Randy
>
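
For readers following the thread, the bookkeeping rule described in the quoted
patch can be sketched as a small userspace model. This is only an illustration
under stated assumptions, not the kernel implementation: struct leaf_node and
every model_*() function below are hypothetical names, a pthread mutex stands
in for the leaf node lock, and the races handled by rcu_gp_init() are left out.
It shows RCU keeping its own online mask, the quiescent state being reported
at offline time, and the onlining path only checking that nothing is still owed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct leaf_node {
	pthread_mutex_t lock;  /* stands in for the leaf node lock */
	uint32_t online_mask;  /* the model's own record of online CPUs */
	uint32_t qs_pending;   /* CPUs the current grace period waits on */
};

static void model_leaf_init(struct leaf_node *n, uint32_t online)
{
	pthread_mutex_init(&n->lock, NULL);
	n->online_mask = online;
	n->qs_pending = 0;
}

/* Grace-period init: wait only on CPUs the bookkeeping says are online. */
static void model_gp_init(struct leaf_node *n)
{
	pthread_mutex_lock(&n->lock);
	n->qs_pending = n->online_mask;
	pthread_mutex_unlock(&n->lock);
}

/* An online CPU reports its own quiescent state. */
static void model_report_qs(struct leaf_node *n, int cpu)
{
	pthread_mutex_lock(&n->lock);
	n->qs_pending &= ~(UINT32_C(1) << cpu);
	pthread_mutex_unlock(&n->lock);
}

/* Offlining: report the quiescent state for the CPU, then mark it offline. */
static void model_cpu_offline(struct leaf_node *n, int cpu)
{
	uint32_t bit = UINT32_C(1) << cpu;

	pthread_mutex_lock(&n->lock);
	n->qs_pending &= ~bit;
	n->online_mask &= ~bit;
	pthread_mutex_unlock(&n->lock);
}

/* Onlining: bookkeeping only; complain if a quiescent state is still owed. */
static void model_cpu_starting(struct leaf_node *n, int cpu)
{
	uint32_t bit = UINT32_C(1) << cpu;

	pthread_mutex_lock(&n->lock);
	assert(!(n->qs_pending & bit)); /* models the warning in the patch text */
	n->online_mask |= bit;
	pthread_mutex_unlock(&n->lock);
}

/* Returns 1 once no CPU on this node is still being waited on. */
static int model_gp_done(struct leaf_node *n)
{
	int done;

	pthread_mutex_lock(&n->lock);
	done = (n->qs_pending == 0);
	pthread_mutex_unlock(&n->lock);
	return done;
}
```

Because every read and update of the two masks happens under the same lock,
a grace period that starts while a CPU is going offline sees either the old
or the new state, never a mix. In this model nothing like the FQS loop ever
has to report a quiescent state for an offline CPU.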