Re: [PATCH v2 tip/core/rcu 01/39] rcu: Maintain special bits at bottom of ->dynticks counter

From: Paul E. McKenney
Date: Tue Apr 18 2017 - 14:19:53 EST


On Mon, Apr 17, 2017 at 05:07:55PM -0700, Josh Triplett wrote:
> On Mon, Apr 17, 2017 at 04:44:48PM -0700, Paul E. McKenney wrote:
> > Currently, IPIs are used to force other CPUs to invalidate their TLBs
> > in response to a kernel virtual-memory mapping change. This works, but
> > degrades both battery lifetime (for idle CPUs) and real-time response
> > (for nohz_full CPUs), and in addition results in unnecessary IPIs due to
> > the fact that CPUs executing in usermode are unaffected by stale kernel
> > mappings. It would be better to cause a CPU executing in usermode to
> > wait until it is entering kernel mode to do the flush, first to avoid
> > interrupting usermode tasks and second to handle multiple flush requests
> > with a single flush in the case of a long-running user task.
> >
> > This commit therefore reserves a bit at the bottom of the ->dynticks
> > counter, which is checked upon exit from extended quiescent states.
> > If it is set, it is cleared and then a new rcu_eqs_special_exit() macro is
> > invoked, which, if not supplied, is an empty single-pass do-while loop.
> > If this bottom bit is set on -entry- to an extended quiescent state,
> > then a WARN_ON_ONCE() triggers.
> >
> > This bottom bit may be set using a new rcu_eqs_special_set() function,
> > which returns true if the bit was set, or false if the CPU turned
> > out to not be in an extended quiescent state. Please note that this
> > function refuses to set the bit for a non-nohz_full CPU when that CPU
> > is executing in usermode because usermode execution is tracked by RCU
> > as a dyntick-idle extended quiescent state only for nohz_full CPUs.
> >
> > Reported-by: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>
> Comments below. With those fixed:
> Reviewed-by: Josh Triplett <josh@xxxxxxxxxxxxxxxx>

Good eyes, fixed!

Thanx, Paul

> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -290,15 +300,20 @@ static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
> >  static void rcu_dynticks_eqs_enter(void)
> >  {
> >  	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> > -	int special;
> > +	int seq;
> >
> >  	/*
> >  	 * CPUs seeing atomic_inc_return() must see prior RCU read-side
> >  	 * critical sections, and we also must force ordering with the
> >  	 * next idle sojourn.
> >  	 */
> > -	special = atomic_inc_return(&rdtp->dynticks);
> > -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && special & 0x1);
> > +	seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks);
>
> You changed atomic_inc_return to atomic_add_return here, but the comment
> above still says atomic_inc_return.
>
> > @@ -308,15 +323,22 @@ static void rcu_dynticks_eqs_enter(void)
> >  static void rcu_dynticks_eqs_exit(void)
> >  {
> >  	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> > -	int special;
> > +	int seq;
> >
> >  	/*
> >  	 * CPUs seeing atomic_inc_return() must see prior idle sojourns,
> >  	 * and we also must force ordering with the next RCU read-side
> >  	 * critical section.
> >  	 */
> > -	special = atomic_inc_return(&rdtp->dynticks);
> > -	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(special & 0x1));
> > +	seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdtp->dynticks);
>
> Likewise.
>
> - Josh Triplett
>
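
For readers following the commit log, the reserved-bit scheme can be
modelled in userspace with C11 atomics. This is only a rough sketch,
not the kernel code: it tracks a single CPU, it assumes the bottom bit
is the request bit and that the counter advances in steps of 2 (so the
next bit up means "not in an extended quiescent state"), and it leaves
out the memory-ordering details the real code has to get right. The
names CTRL_MASK, CTRL_CTR, eqs_enter(), eqs_exit(), eqs_special_set(),
eqs_special_exit() and demo() are made up for illustration; only
RCU_DYNTICK_CTRL_CTR appears in the quoted hunks, and its value is not
shown there.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed layout, for illustration only. */
#define CTRL_MASK 0x1              /* bottom bit: deferred-work request */
#define CTRL_CTR  (CTRL_MASK + 1)  /* counter step; bit set => not in EQS */

static atomic_int dynticks = CTRL_CTR;  /* model CPU starts outside EQS */
static int special_exits;               /* times deferred work ran */

/* Stand-in for the deferred work, e.g. a TLB flush. */
static void eqs_special_exit(void)
{
	special_exits++;
}

/* Enter an extended quiescent state (idle, or nohz_full usermode). */
static void eqs_enter(void)
{
	int seq = atomic_fetch_add(&dynticks, CTRL_CTR) + CTRL_CTR;

	assert(!(seq & CTRL_CTR));   /* counter must now say "in EQS" */
	assert(!(seq & CTRL_MASK));  /* no request may be pending on entry */
}

/* Leave the extended quiescent state, running any requested work. */
static void eqs_exit(void)
{
	int seq = atomic_fetch_add(&dynticks, CTRL_CTR) + CTRL_CTR;

	assert(seq & CTRL_CTR);      /* counter must now say "not in EQS" */
	if (seq & CTRL_MASK) {
		atomic_fetch_and(&dynticks, ~CTRL_MASK);  /* clear request */
		eqs_special_exit();
	}
}

/* Request deferred work; refuse unless the model CPU is in an EQS. */
static bool eqs_special_set(void)
{
	int old = atomic_load(&dynticks);

	do {
		if (old & CTRL_CTR)
			return false;  /* CPU is active: caller must fall back */
	} while (!atomic_compare_exchange_weak(&dynticks, &old,
					       old | CTRL_MASK));
	return true;
}

/* Walk through one refused and one accepted request; returns 1 on success. */
int demo(void)
{
	bool refused = !eqs_special_set();  /* CPU busy, so refused */
	bool accepted;

	eqs_enter();
	accepted = eqs_special_set();       /* CPU in EQS, so accepted */
	eqs_exit();                         /* runs the deferred work once */

	return refused && accepted && special_exits == 1 &&
	       atomic_load(&dynticks) == 3 * CTRL_CTR;
}
```

In this model a false return from eqs_special_set() is the case where
the requester would still have to interrupt the target CPU, and a true
return means the work has been deferred to that CPU's next eqs_exit().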