Re: [PATCH v5 03/12] sched: fix avg_load computation

From: Tim Chen
Date: Wed Sep 03 2014 - 19:43:30 EST


On Wed, 2014-09-03 at 13:09 +0200, Vincent Guittot wrote:
> On 30 August 2014 14:00, Preeti U Murthy <preeti@xxxxxxxxxxxxxxxxxx> wrote:
> > Hi Vincent,
> >
> > On 08/26/2014 04:36 PM, Vincent Guittot wrote:
> >> The computation of avg_load and avg_load_per_task should only take into
> >> account the number of cfs tasks. The non-cfs tasks are already taken into
> >> account by decreasing the cpu's capacity and they will be tracked in the
> >> CPU's utilization (group_utilization) in the next patches
> >>
> >> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> >> ---
> >> kernel/sched/fair.c | 4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 87b9dc7..b85e9f7 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -4092,7 +4092,7 @@ static unsigned long capacity_of(int cpu)
> >>  static unsigned long cpu_avg_load_per_task(int cpu)
> >>  {
> >>          struct rq *rq = cpu_rq(cpu);
> >> -        unsigned long nr_running = ACCESS_ONCE(rq->nr_running);
> >> +        unsigned long nr_running = ACCESS_ONCE(rq->cfs.h_nr_running);
> >>          unsigned long load_avg = rq->cfs.runnable_load_avg;
> >>
> >>          if (nr_running)
> >> @@ -5985,7 +5985,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> >>                  load = source_load(i, load_idx);
> >>
> >>                  sgs->group_load += load;
> >> -                sgs->sum_nr_running += rq->nr_running;
> >> +                sgs->sum_nr_running += rq->cfs.h_nr_running;
> >>
> >>                  if (rq->nr_running > 1)
> >>                          *overload = true;
> >>
> >
> > Why do we probe rq->nr_running while we do load balancing? Should not we
> > be probing cfs_rq->nr_running instead? We are interested after all in
> > load balancing fair tasks right? The reason I ask this is, I was
> > wondering if we need to make the above similar change in more places in
> > load balancing.
>
> Hi Preeti,
>
> Yes, we should probably test rq->cfs.h_nr_running > 0 before
> setting overload.
>

The overload indicator tells us when we can skip load balancing entirely
for a cpu that is about to go idle. We can skip it only when no cpu has
more than one task. So if you have, say, just one fair task and multiple
deadline tasks on a cpu, and another cpu is about to go idle, you should
turn on normal load balancing in the idle path by setting overload to
true.

So overload should be set based on rq->nr_running and not on
rq->cfs.h_nr_running.
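
To make the distinction concrete, here is a rough userspace sketch, not the
kernel code. The toy_* types and toy_update_stats() are made up for
illustration; only the field names nr_running and cfs.h_nr_running mirror
the patch. It counts fair tasks for the load statistics but keys overload
off the total number of runnable tasks:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct cfs_rq and struct rq. */
struct toy_cfs_rq {
    unsigned int h_nr_running;   /* runnable fair (cfs) tasks only */
};

struct toy_rq {
    unsigned int nr_running;     /* runnable tasks of every class */
    struct toy_cfs_rq cfs;
};

/* Hypothetical result type; the kernel passes overload as a bool pointer. */
struct toy_sg_stats {
    unsigned int sum_nr_running; /* fair tasks, feeds avg_load_per_task */
    bool overload;               /* some cpu has more than one task */
};

static struct toy_sg_stats toy_update_stats(const struct toy_rq *rqs, size_t n)
{
    struct toy_sg_stats sgs = { 0, false };

    for (size_t i = 0; i < n; i++) {
        /* Fair tasks are a subset of all runnable tasks on the cpu. */
        assert(rqs[i].cfs.h_nr_running <= rqs[i].nr_running);

        /* Load statistics only count fair tasks, as in the patch. */
        sgs.sum_nr_running += rqs[i].cfs.h_nr_running;

        /* Overload looks at every class, not just fair tasks. */
        if (rqs[i].nr_running > 1)
            sgs.overload = true;
    }
    return sgs;
}
```

With one cpu running one fair task plus two deadline tasks (nr_running = 3,
h_nr_running = 1) and a second cpu empty, this reports sum_nr_running = 1
and overload = true. Testing h_nr_running > 1 instead would leave overload
false, and the idle cpu would skip balancing.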

Thanks.

Tim

