Re: [PATCH 03/15] sched,fair: redefine runnable_load_avg as the sum of task_h_load
From: Rik van Riel
Date: Wed Aug 28 2019 - 10:48:02 EST
On Wed, 2019-08-28 at 15:50 +0200, Vincent Guittot wrote:
> Hi Rik,
>
> On Thu, 22 Aug 2019 at 04:18, Rik van Riel <riel@xxxxxxxxxxx> wrote:
> > The runnable_load magic is used to quickly propagate information about
> > runnable tasks up the hierarchy of runqueues. The runnable_load_avg is
> > mostly used for the load balancing code, which only examines the value
> > at the root cfs_rq.
> >
> > Redefine the root cfs_rq runnable_load_avg to be the sum of task_h_loads
> > of the runnable tasks. This works because the hierarchical runnable_load
> > of a task is already equal to the task_se_h_load today. This provides
> > enough information to the load balancer.
> >
> > The runnable_load_avg of the cgroup cfs_rqs does not appear to be
> > used for anything, so don't bother calculating those.
> >
> > This removes one of the things that the code currently traverses the
> > cgroup hierarchy for, and getting rid of it brings us one step closer
> > to a flat runqueue for the CPU controller.
>
> I like your proposal but just wanted to clarify one thing with this
> patch.
> Although you removed the computation of runnable_load_avg of the
> cgroup cfs_rq, we are still traversing the hierarchy to update the
> root cfs_rq runnable_load_avg because we are traversing the hierarchy
> to compute the task_h_loads.
The task_h_load hierarchy traversal in update_cfs_rq_h_load
is rate limited to once a jiffy, though. Rate limiting the
hierarchy traversal significantly reduces overhead.
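
To illustrate the rate limiting, here is a rough userspace sketch, not
the kernel code. The names toy_cfs_rq, toy_jiffies, toy_walks and
toy_update_h_load are made up for this example, and the recursion
stands in for the kernel's iterative walk. Each runqueue remembers the
tick at which its h_load was last refreshed, and a second call in the
same tick returns before touching the hierarchy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's jiffies tick counter. */
static unsigned long toy_jiffies = 1;

/* Counts how many runqueues were actually refreshed (for illustration). */
static unsigned long toy_walks;

/* Simplified, made-up model of a cgroup runqueue; not struct cfs_rq. */
struct toy_cfs_rq {
	struct toy_cfs_rq *parent;        /* NULL for the root runqueue */
	unsigned long load_avg;           /* load of this runqueue */
	unsigned long se_load_avg;        /* load of this group's entity in the parent */
	unsigned long h_load;             /* cached hierarchical load */
	unsigned long last_h_load_update; /* tick of the last refresh */
};

/* Refresh h_load from the root down, at most once per tick per runqueue. */
static void toy_update_h_load(struct toy_cfs_rq *cfs_rq)
{
	assert(cfs_rq != NULL);

	/* Already refreshed during this tick: skip the traversal. */
	if (cfs_rq->last_h_load_update == toy_jiffies)
		return;

	toy_walks++;

	if (!cfs_rq->parent) {
		/* The root's hierarchical load is simply its own load. */
		cfs_rq->h_load = cfs_rq->load_avg;
	} else {
		/* Make sure the parent is current, then scale by our share of it. */
		toy_update_h_load(cfs_rq->parent);
		cfs_rq->h_load = cfs_rq->parent->h_load * cfs_rq->se_load_avg /
				 (cfs_rq->parent->load_avg + 1);
	}

	cfs_rq->last_h_load_update = toy_jiffies;
}
```

In this model, calling toy_update_h_load() on the same runqueue many
times within one tick walks the hierarchy only once; the walk is
repeated only after toy_jiffies advances.
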
> That being said, if we manage to remove the need to use
> runnable_load_avg we will completely skip this traversal. I have a
> proposal to remove it from the load balance and wake up paths, but I
> haven't looked at the NUMA stats, which also use it.
--
All Rights Reversed.