Re: [PATCH 10/15] sched: Check for sched_mn_power_savings when doing load balancing

From: Vaidyanathan Srinivasan
Date: Mon Aug 24 2009 - 11:41:04 EST


* Peter Zijlstra <peterz@xxxxxxxxxxxxx> [2009-08-24 17:03:40]:

> On Thu, 2009-08-20 at 15:41 +0200, Andreas Herrmann wrote:
> > The patch adds support for POWERSAVINGS_BALANCE_BASIC for MN domain
> > level. Currently POWERSAVINGS_BALANCE_WAKEUP is not used for MN domain.
> >
> > (I have to admit that so far I don't have the correct understanding
> > what's the benefit of POWERSAVINGS_BALANCE_WAKEUP (when a dedicated
> > wakeup CPU is used) in contrast to POWERSAVINGS_BALANCE_BASIC. I also
> > have not found an example that would demonstrate the difference
> > between those two powersaving levels.)
>
> blame svaidy for not writing enough comments ;-)

I am here to explain ;)

> iirc it moves tasks to sched_mc_preferred_wakeup_cpu instead of waking
> an idle cpu, this leaves idle cpus idle longer at the cost of creating
> overload on other cpus.

Yes, as Peter said, POWERSAVINGS_BALANCE_WAKEUP biases task wakeups
toward sched_mc_preferred_wakeup_cpu, which has been nominated by
previous load balance loops.

Task wakeup biasing with sched_mc=2 works for most workloads, such as
kernbench and other tasks that sleep often and move in and out of the
runqueue. The default sched_mc=1 works only for jobs that run much
longer than the load balance interval, or for almost 100% CPU-intensive
jobs, where the load balancer has enough time to identify the load
pattern and initiate a task migration.

The wakeup biasing (sched_mc=2) helps move bursty jobs faster and
statistically packs them into a single package, which saves power.
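
To make the difference between the two levels concrete, here is a
minimal userspace sketch of the wakeup decision. It is a toy model, not
the kernel code: struct toy_sched, its fields, and
toy_select_wakeup_cpu() are hypothetical names made up for illustration,
and the condition is simplified to "the task's previous CPU is idle".

```c
#include <stdbool.h>

/* Level names follow the ones used in this thread. */
enum powersavings_balance_level {
	POWERSAVINGS_BALANCE_NONE = 0,
	POWERSAVINGS_BALANCE_BASIC,	/* sched_mc=1 */
	POWERSAVINGS_BALANCE_WAKEUP,	/* sched_mc=2 */
};

/* Hypothetical, simplified stand-in for the scheduler state. */
struct toy_sched {
	int sched_mc_power_savings;	/* current power-savings level */
	int preferred_wakeup_cpu;	/* nominated by an earlier balance pass */
	bool cpu_idle[8];		/* true if that CPU is currently idle */
};

/*
 * Pick the CPU on which a waking task should run.
 *
 * At sched_mc=2, a task whose previous CPU is idle is redirected to the
 * preferred wakeup CPU, so the idle CPU can stay idle. At lower levels
 * the task simply wakes on its previous CPU and consolidation is left
 * to the periodic load balancer.
 */
static int toy_select_wakeup_cpu(const struct toy_sched *s, int prev_cpu)
{
	if (s->sched_mc_power_savings >= POWERSAVINGS_BALANCE_WAKEUP &&
	    s->cpu_idle[prev_cpu] &&
	    s->preferred_wakeup_cpu != prev_cpu)
		return s->preferred_wakeup_cpu;

	return prev_cpu;
}
```

With the preferred CPU set to 0 and CPU 3 idle, a task that last ran on
CPU 3 is woken on CPU 0 at sched_mc=2, but stays on CPU 3 at sched_mc=1.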

> > Signed-off-by: Andreas Herrmann <andreas.herrmann3@xxxxxxx>
> > ---
> > kernel/sched.c | 5 +++--
> > 1 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index ebcda58..7a0d710 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -4591,7 +4591,8 @@ static int find_new_ilb(int cpu)
> > * Have idle load balancer selection from semi-idle packages only
> > * when power-aware load balancing is enabled
> > */
> > - if (!(sched_smt_power_savings || sched_mc_power_savings))
> > + if (!(sched_smt_power_savings || sched_mc_power_savings ||
> > + sched_mn_power_savings))
> > goto out_done;
> >
> > /*
> > @@ -4681,7 +4682,7 @@ int select_nohz_load_balancer(int stop_tick)
> > int new_ilb;
> >
> > if (!(sched_smt_power_savings ||
> > - sched_mc_power_savings))
> > + sched_mc_power_savings || sched_mn_power_savings))
> > return 1;
> > /*
> > * Check to see if there is a more power-efficient


You can achieve the balancing effects by propagating the SD_ flags at
the right domain level with the same sysfs interface. At some point
we wanted to change to sched_power_savings=N and set the flags
according to system topology to provide consolidation at the right
sched_domain and save power.

--Vaidy
