Re: nohz problem with idle time on old hardware
From: Viresh Kumar
Date: Wed Apr 09 2014 - 09:52:18 EST
On Thu, Nov 14, 2013 at 1:31 AM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> Subject: NOHZ: Check for nohz active instead of nohz enabled
>
> RCU and the fine grained idle time accounting functions check
> tick_nohz_enabled. But that variable is merely telling that NOHZ has
> been enabled in the config and not been disabled on the command line.
>
> But it does not tell anything about nohz being active. That's what all
> this should check for.
>
> Matthew reported, that the idle accounting on his old P1 machine
> showed bogus values, when he enabled NOHZ in the config and did not
> disable it on the kernel command line. The reason is that his machine
> uses (refined) jiffies as a clocksource which explains why the "fine"
> grained accounting went into lala land, because it depends on when the
> system goes and leaves idle relative to the jiffies increment.
>
> Provide a tick_nohz_active indicator and let RCU and the accounting
> code use this instead of tick_nohz_enabled.
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> @@ -973,7 +968,7 @@ static void tick_nohz_switch_to_nohz(void)
> 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
> 	ktime_t next;
>
> -	if (!tick_nohz_enabled)
> +	if (!tick_nohz_active)
> 		return;
Considering the impressive list of Reviewed-by tags and the people
involved in this patch, I am not sure I am reading the code correctly
here.

As I understand it, the above change should not have been made: with it,
we will never get past that check. tick_nohz_active is initialized to
zero, so we will keep returning early forever and never reach the
assignment below that sets it to 1.

I have a patch to fix it up, but wanted to know your opinion before
sending it.
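
To make the concern concrete, here is a minimal user-space sketch of the
control flow, not the kernel code itself. The variable names
tick_nohz_enabled and tick_nohz_active come from the patch; the two
function names are hypothetical, and everything between the guard and the
assignment (the oneshot switch and its error path) is left out.

```c
/* Stand-ins for the kernel variables named in the patch. */
int tick_nohz_enabled = 1;  /* config/command line allow NOHZ */
int tick_nohz_active;       /* static storage, so it starts at 0 */

/*
 * Hypothetical model of the patched function: the guard tests the same
 * flag that is only assigned after the guard. Returns 1 if the switch
 * to NOHZ happened, 0 if the function bailed out early.
 */
int switch_guard_on_active(void)
{
	if (!tick_nohz_active)
		return 0;
	tick_nohz_active = 1;
	return 1;
}

/*
 * Hypothetical model of the pre-patch guard: test the "enabled" flag,
 * then mark NOHZ as active once the switch succeeds.
 */
int switch_guard_on_enabled(void)
{
	if (!tick_nohz_enabled)
		return 0;
	tick_nohz_active = 1;
	return 1;
}
```

In this model, switch_guard_on_active() returns 0 on every call for as
long as nothing else sets tick_nohz_active, while
switch_guard_on_enabled() sets the flag on its first call.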
> 	local_irq_disable();
> @@ -981,7 +976,7 @@ static void tick_nohz_switch_to_nohz(void)
> 		local_irq_enable();
> 		return;
> 	}
> -
> +	tick_nohz_active = 1;
> 	ts->nohz_mode = NOHZ_MODE_LOWRES;
>
> 	/*