Re: [PATCH] v4 Teach RCU that idle task is not quiscent state at boot

From: Vegard Nossum
Date: Wed Feb 25 2009 - 13:38:41 EST


2009/2/25 Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>:
> This patch fixes a bug located by Vegard Nossum with the aid of
> kmemcheck, updated based on review comments from Nick Piggin,
> Ingo Molnar, and Andrew Morton.
>
> The boot CPU runs in the context of its idle thread during boot-up.
> During this time, idle_cpu(0) will always return nonzero, which will
> fool Classic and Hierarchical RCU into deciding that a large chunk of
> the boot-up sequence is a big long quiescent state. ÂThis in turn causes
> RCU to prematurely end grace periods during this time.
>
> This patch changes the rcutree.c and rcuclassic.c rcu_check_callbacks()
> function to ignore the idle task as a quiescent state until the
> system has started up the scheduler in rest_init(), introducing a
> new non-API function rcu_idle_now_means_idle() to inform RCU of this
> transition. ÂRCU maintains an internal rcu_idle_cpu_truthful variable
> to track this state, which is then used by rcu_check_callback() to
> determine if it should believe idle_cpu().
>
> Because this patch has the effect of disallowing RCU grace periods
> during long stretches of the boot-up sequence, this patch also introduces
> Josh Triplett's UP-only optimization that makes synchronize_rcu() be a
> no-op if num_online_cpus() returns 1. ÂThis allows boot-time code that
> calls synchronize_rcu() to proceed normally. ÂNote, however, that RCU
> callbacks registered by call_rcu() will likely queue up until later in
> the boot sequence. ÂAlthough rcuclassic and rcutree can also use this
> same optimization after boot completes, rcupreempt must restrict its
> use of this optimization to the portion of the boot sequence before the
> scheduler starts up, given that an rcupreempt RCU read-side critical
> section may be preeempted.
>
> In addition, this patch takes Nick Piggin's suggestion to make the
> system_state global variable be __read_mostly.
>
> Changes since v3:
>
> o    WARN_ON(nr_context_switches() > 0) to verify that RCU
> Â Â Â Âswitches out of boot-time mode before the first context
> Â Â Â Âswitch, as suggested by Nick Piggin.
>
> Changes since v2:
>
> o    Created rcu_blocking_is_gp() internal-to-RCU API that
> Â Â Â Âdetermines whether a call to synchronize_rcu() is itself
> Â Â Â Âa grace period.
>
> o    The definition of rcu_blocking_is_gp() for rcuclassic and
> Â Â Â Ârcutree checks to see if but a single CPU is online.
>
> o    The definition of rcu_blocking_is_gp() for rcupreempt
> Â Â Â Âchecks to see both if but a single CPU is online and if
> Â Â Â Âthe system is still in early boot.
>
> Â Â Â ÂThis allows rcupreempt to again work correctly if running
> Â Â Â Âon a single CPU after booting is complete.
>
> o    Added check to rcupreempt's synchronize_sched() for there
> Â Â Â Âbeing but one online CPU.
>
> Tested all three variants both SMP and !SMP, booted fine, passed a short
> rcutorture test on both x86 and Power.
>
> Located-by: Vegard Nossum <vegard.nossum@xxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>

Tested-by: Vegard Nossum <vegard.nossum@xxxxxxxxx>

I used such a patch and config (as attached) to verify. On unpatched
kernel, I get this:

rcu callbacks called: 2300 (should be 4000)

And with the patch, it seems better:

rcu callbacks called: 4000 (should be 4000)

Actually, the first part of the text itself is wrong, because it used
to count upwards. But it should really say "rcu callbacks NOT called".

Of course, this kind of check will always be inaccurate, so it is not
fit for mainline. The inaccuracy stems from the fact that there could
be other users of RCU that disturb our expected behaviour, and we
don't know how quickly the callbacks will be run, only that if
"enough" of them run inside the spinlock, something is probably wrong.
But... what am I saying, you're the expert ;-)

But it is useful enough to verify your patch. Some other configs I
tried gave the correct result all the time, but I don't know what is
special about this one.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

Attachment: .config
Description: Binary data

Attachment: 0001-rcu-boot-self-test.patch
Description: Binary data