Re: [PATCH RFC nohz_full 7/7] nohz_full: Force RCU's grace-periodkthreads onto timekeeping CPU

From: Paul E. McKenney
Date: Mon Jul 29 2013 - 13:53:44 EST

On Mon, Jul 29, 2013 at 06:59:46PM +0200, Frederic Weisbecker wrote:
> On Mon, Jul 29, 2013 at 09:52:53AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 29, 2013 at 11:36:05AM +0800, Lai Jiangshan wrote:
> > However, on small systems, rcu_sysidle_check_cpu() can be called from
> > the timekeeping CPU. I suppose that this could potentially happen
> > before the first grace period starts, and in that case, we could
> > potentially see a spurious warning. I could imagine a number of ways
> > to fix this:
> >
> > 1. Bind the kthread when it is created.
> >
> > 2. Bind the kthread when it first starts running, rather than just
> > after the grace period starts.
> >
> > 3. Suppress the warning when there is no grace period in progress.
> >
> > 4. Suppress the warning prior to the first grace period starting.
> >
> > Seems like #3 is the most straightforward approach. I just change it to:
> >
> > if (rcu_gp_in_progress(rdp->rsp))
> > WARN_ON_ONCE(smp_processor_id() != tick_do_timer_cpu);
> >
> > This still gets a WARN_ON_ONCE() if someone moves the timekeeping CPU,
> > but Frederic tells me that it never moves. My WARN_ON_ONCE() has some
> > probability of complaining should some bug creep in.
> It doesn't move for now but keep in mind that it will probably be able
> to move in the future. If we have several non full-dynticks CPUs, balancing
> the timekeeping duty between them, depending which one runs at a given time,
> may improve power savings even better.
> But you can ignore that for now. Your patchset is entertaining enough that
> we don't need to add more complications yet ;)

Yeah, we will need some sort of handshake for that. Might be a simple
as setting a flag that suppresses the warning, which I clear the next
time I bind the kthread. Well, it would need to deal with closely-spaced
moves of the timekeeping duty, wouldn't it? Plus it would need to deal
with the fact that sampling the variable referencing the timekeeping CPU,
sampling the current CPU, and binding the kthread cannot be done as one
big atomic operation. Which means that their would need to be two calls
in the handshake, one to prepare to move the timekeeping CPU and another
to announce that it had in fact been moved.

Which is not too hard -- I use an irq-disable lock to guard setting and
clearing an internal-to-RCU flag noting the upcoming change. The
check and WARN_ON_ONCE() are done while holding this same lock. The
flag is cleared only after the kthread-bind operation that follows the
last "it has in fact been moved" handshake. So the flag has three states,
idle, ready to move, and moved. The possibility of closely spaced moves
of the timekeeping kthread are dealt with by transitioning from "moved"
to "ready to move". The state goes back to "idle" only after completion
of a kthread-bind operation in the "moved" state.

So agreed, let's defer this one. ;-)

Thanx, Paul

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at