Re: [PATCH tip/core/rcu 14/15] time: RCU permitted to stop idle entry via softirq
From: Peter Zijlstra
Date: Thu Sep 06 2012 - 11:14:01 EST
On Thu, 2012-08-30 at 11:56 -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paul.mckenney@xxxxxxxxxx>
>
> The can_stop_idle_tick() function complains if a softirq vector is
> raised too late in the idle-entry process, presumably in order to
> prevent dangling softirq invocations from being delayed across the
> full idle period, which might be indefinitely long -- and if softirq
> was asserted any later than the call to this function, such a delay
> might well happen.
>
> However, RCU needs to be able to use softirq to stop idle entry in
> order to be able to drain RCU callbacks from the current CPU, which in
> turn enables faster entry into dyntick-idle mode, which in turn reduces
> power consumption. Because RCU takes this action at a well-defined
> point in the idle-entry path, it is safe for RCU to take this approach.
>
> This commit therefore silences the error message that is sometimes
> produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ
> to process. The error message will continue to be issued for other
> softirq vectors.
>
> Reported-by: Sedat Dilek <sedat.dilek@xxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paul.mckenney@xxxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Tested-by: Sedat Dilek <sedat.dilek@xxxxxxxxx>
> ---
> include/linux/interrupt.h | 2 ++
> kernel/time/tick-sched.c | 3 ++-
> 2 files changed, 4 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index c5f856a..5e4e617 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -430,6 +430,8 @@ enum
> NR_SOFTIRQS
> };
>
> +#define SOFTIRQ_STOP_IDLE_MASK (~(1 << RCU_SOFTIRQ))
> +
> /* map softirq index to softirq name. update 'softirq_to_name' in
> * kernel/softirq.c when adding a new softirq.
> */
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 024540f..4b1785a 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -436,7 +436,8 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
> if (unlikely(local_softirq_pending() && cpu_online(cpu))) {
> static int ratelimit;
>
> - if (ratelimit < 10) {
> + if (ratelimit < 10 &&
> + (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
> printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
> (unsigned int) local_softirq_pending());
> ratelimit++;

Urgh.. yuck. So either add a very verbose comment here on why it's OK for
RCU (the changelog is rather vague about it), or try to come up with
something better.

Where does RCU flush the pending softirq? Does it flush all softirqs or
only the RCU one? Can we move the check after RCU does this so we can
avoid the special case?
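
For reference, the effect of the proposed mask on the warning condition can
be sketched in user space. This is only an illustration, not kernel code:
the enum below is assumed to mirror the softirq ordering in
include/linux/interrupt.h at the time of the patch, and nohz_would_warn()
is a hypothetical helper that stands in for the patched test inside
can_stop_idle_tick().

```c
/*
 * User-space sketch of the patched warning condition.
 * Assumption: this enum mirrors the softirq vector order in
 * include/linux/interrupt.h at the time of the patch.
 */
enum {
	HI_SOFTIRQ = 0,
	TIMER_SOFTIRQ,
	NET_TX_SOFTIRQ,
	NET_RX_SOFTIRQ,
	BLOCK_SOFTIRQ,
	BLOCK_IOPOLL_SOFTIRQ,
	TASKLET_SOFTIRQ,
	SCHED_SOFTIRQ,
	HRTIMER_SOFTIRQ,
	RCU_SOFTIRQ,
	NR_SOFTIRQS
};

/* Taken from the patch: every vector except RCU_SOFTIRQ. */
#define SOFTIRQ_STOP_IDLE_MASK (~(1 << RCU_SOFTIRQ))

/*
 * Hypothetical helper (not a kernel function): returns 1 if the
 * patched code would print the "NOHZ: local_softirq_pending" message
 * for the given pending bitmap and ratelimit counter, else 0.
 */
static int nohz_would_warn(unsigned int pending, int ratelimit)
{
	return ratelimit < 10 && (pending & SOFTIRQ_STOP_IDLE_MASK) != 0;
}
```

With only RCU_SOFTIRQ pending, nohz_would_warn(1u << RCU_SOFTIRQ, 0)
returns 0, so the message is suppressed. Any other pending vector, alone or
together with RCU_SOFTIRQ, still produces the message until the ratelimit
counter reaches 10.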