Re: high power consumption in recent kernels

From: Peter Zijlstra
Date: Thu Sep 30 2010 - 08:36:39 EST


On Thu, 2010-09-30 at 16:27 +0800, Alex,Shi wrote:
> Thanks Norbert!
> Mike & Peter:
> would you like to add your Signed-off-by to the following patch?

I would really rather see the nohz decision move into the whole cpuidle
governor thing, which can make a much better cost vs benefit decision.

Thomas, Arjan?

> ---
> sched: nohz_ratelimit function refresh
>
> The nohz_ratelimit() function written by Mike Galbraith
> improves netperf TCP/UDP RR throughput by more than 10%
> when scheduling cross-cpu. It does this by reducing the chance
> of entering nohz mode.
> But the patch also reduces the CPU's chance of entering nohz mode
> after an interrupt is processed, which caused a 4 watt power
> increase on Norbert's system (about 100 int/sec under a light load).
> That is not acceptable for a laptop.
>
> So I removed the nohz_ratelimit() check from the irq_exit() path, and
> Norbert's system is back to low power consumption.
>
> Tested-by: Norbert Preining <preining@xxxxxxxx>
> Signed-off-by: Alex Shi <alex.shi@xxxxxxxxx>
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 1e2a6db..a4dbb37 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -274,8 +274,13 @@ extern cpumask_var_t nohz_cpu_mask;
> #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ)
> extern void select_nohz_load_balancer(int stop_tick);
> extern int get_nohz_timer_target(void);
> +extern int nohz_ratelimit(int cpu);
> #else
> static inline void select_nohz_load_balancer(int stop_tick) { }
> +static inline int nohz_ratelimit(int cpu)
> +{
> + return 0;
> +}
> #endif
>
> /*
> diff --git a/kernel/sched.c b/kernel/sched.c
> index dc85ceb..132a21c 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1182,6 +1182,16 @@ static void resched_task(struct task_struct *p)
> smp_send_reschedule(cpu);
> }
>
> +int nohz_ratelimit(int cpu)
> +{
> + struct rq *rq = cpu_rq(cpu);
> + u64 diff = rq->clock - rq->nohz_stamp;
> +
> + rq->nohz_stamp = rq->clock;
> +
> + return diff < (NSEC_PER_SEC / HZ) >> 1;
> +}
> +
> static void resched_cpu(int cpu)
> {
> struct rq *rq = cpu_rq(cpu);
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 3e216e0..19a7914 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -325,7 +325,7 @@ void tick_nohz_stop_sched_tick(int inidle)
> } while (read_seqretry(&xtime_lock, seq));
>
> if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
> - arch_needs_cpu(cpu)) {
> + arch_needs_cpu(cpu) || (inidle && nohz_ratelimit(cpu))) {
> next_jiffies = last_jiffies + 1;
> delta_jiffies = 1;
> } else {
>
>
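
For anyone following along, here is a rough userspace sketch of what the
quoted hunks do. It is not kernel code: struct fake_rq, fake_nohz_ratelimit()
and keep_tick() are hypothetical stand-ins, and HZ=1000 is an assumed
configuration. It only mirrors the arithmetic in nohz_ratelimit() and the
"inidle &&" guard, so the irq_exit() path (inidle == 0) is never ratelimited.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define HZ 1000 /* assumed tick rate; in the kernel this is a config option */

/* Hypothetical stand-in for the two struct rq fields the patch touches. */
struct fake_rq {
	uint64_t clock;      /* current time, in ns */
	uint64_t nohz_stamp; /* time of the previous ratelimit check, in ns */
};

/*
 * Same arithmetic as the quoted nohz_ratelimit(): returns 1 when less
 * than half a tick has passed since the previous check, and always
 * records the current time as the new stamp.
 */
static int fake_nohz_ratelimit(struct fake_rq *rq)
{
	uint64_t diff = rq->clock - rq->nohz_stamp;

	rq->nohz_stamp = rq->clock;

	return diff < ((NSEC_PER_SEC / HZ) >> 1);
}

/*
 * Mirrors the term added to the condition in tick_nohz_stop_sched_tick():
 * the ratelimit is consulted only on the idle path. With inidle == 0
 * (the irq_exit() caller) the && short-circuits, so the stamp is left
 * untouched and the tick may be stopped.
 */
static int keep_tick(struct fake_rq *rq, int inidle)
{
	return inidle && fake_nohz_ratelimit(rq);
}
```

With HZ=1000 the threshold works out to 500000 ns, so under these
assumptions two idle entries less than half a millisecond apart keep the
tick running, while an interrupt exit never does so on account of the
ratelimit.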

