Re: [patch] sched: Fix smp nice induced group scheduling load distribution woes

From: Mike Galbraith
Date: Thu Apr 28 2016 - 08:29:40 EST


On Thu, 2016-04-28 at 11:11 +0200, Peter Zijlstra wrote:
> On Wed, Apr 27, 2016 at 09:09:51AM +0200, Mike Galbraith wrote:
> > On even a modest sized NUMA box any load that wants to scale
> > is essentially reduced to SCHED_IDLE class by smp nice scaling.
> > Limit niceness to prevent cramming a box wide load into a too
> > small space. Given niceness affects latency, give the user the
> > option to completely disable box wide group fairness as well.
>
> Have you tried the (obvious) ?

Duh, nope.

> I suppose we really should just do this (and Yuyang's cleanup patches I
> suppose). Nobody has ever been able to reproduce those increased power
> usage claims and Google is running with this enabled.

Yup, works, and you don't have to carefully blink as you skim past it.

> ---
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 69da6fcaa0e8..968f573413de 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -53,7 +53,7 @@ static inline void cpu_load_update_active(struct rq *this_rq) { }
> * when BITS_PER_LONG <= 32 are pretty high and the returns do not justify the
> * increased costs.
> */
> -#if 0 /* BITS_PER_LONG > 32 -- currently broken: it increases power usage under light load */
> +#ifdef CONFIG_64BIT
> # define SCHED_LOAD_RESOLUTION 10
> # define scale_load(w) ((w) << SCHED_LOAD_RESOLUTION)
> # define scale_load_down(w) ((w) >> SCHED_LOAD_RESOLUTION)