Re: [PATCH 2/2] sched/fair: Always propagate runnable_load_avg
From: Tejun Heo
Date: Wed Apr 26 2017 - 18:52:15 EST
Hello,
On Wed, Apr 26, 2017 at 08:12:09PM +0200, Vincent Guittot wrote:
> On 24 April 2017 at 22:14, Tejun Heo <tj@xxxxxxxxxx> wrote:
> Can the problem be on the load balance side instead ? and more
> precisely in the wakeup path ?
> After looking at the trace, it seems that task placement happens at
> wake up path and if it fails to select the right idle cpu at wake up,
> you will have to wait for a load balance which is already too late
Oh, I was tracing most of scheduler activities and the ratios of
wakeups picking idle CPUs were about the same regardless of cgroup
membership. I can confidently say that the latency issue that I'm
seeing is from the load balancer picking the wrong busiest CPU, which is
not to say that there can't be other problems.
> > another queued wouldn't report the correspondingly higher
>
> It will as load_avg includes the runnable_load_avg so whatever load is
> in runnable_load_avg will be in load_avg too. On the contrary,
> runnable_load_avg will not have the blocked load that is going to wake up
> soon in the case of schbench
The decaying contribution of blocked tasks doesn't affect the busiest CPU
selection. Without cgroup, runnable_load_avg is immediately increased
and decreased as tasks enter and leave the queue and otherwise we end
up with CPUs which are idle when there are threads queued on different
CPUs accumulating scheduling latencies.
The patch doesn't change how the busiest CPU is picked. It already
uses runnable_load_avg. The change that cgroup causes is that it
blocks updates to runnable_load_avg from newly scheduled or sleeping
tasks.
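To make the difference concrete, here is a toy model of the two
behaviors. It is not kernel code: every name in it (toy_rq,
toy_enqueue, toy_tick, toy_find_busiest, TASK_LOAD) is made up for
illustration, and the "move halfway to the true value per period"
rule is only an assumed stand-in for the real averaging. It just shows
how a lagging runnable_load_avg can make the busiest-CPU pick miss a
CPU that has threads queued.

```c
/* Toy model only, NOT kernel code. All names are hypothetical. */
#include <assert.h>

#define TASK_LOAD 1024UL	/* assumed load of one runnable task */

struct toy_rq {
	unsigned long runnable_load_avg;	/* metric the balancer compares */
	int nr_queued;				/* tasks actually queued */
	int in_cgroup;				/* 1: tasks sit below a group entity */
};

/*
 * Flat case: the task's load shows up right away.
 * Cgroup case: the top-level metric is left alone here and only
 * catches up in toy_tick().
 */
static void toy_enqueue(struct toy_rq *rq)
{
	rq->nr_queued++;
	if (!rq->in_cgroup)
		rq->runnable_load_avg += TASK_LOAD;
}

static void toy_dequeue(struct toy_rq *rq)
{
	assert(rq->nr_queued > 0);
	rq->nr_queued--;
	if (!rq->in_cgroup)
		rq->runnable_load_avg -= TASK_LOAD;
}

/*
 * One averaging period. Assumption: in the cgroup case the metric
 * moves halfway toward the load of what is really queued.
 */
static void toy_tick(struct toy_rq *rq)
{
	unsigned long target = (unsigned long)rq->nr_queued * TASK_LOAD;

	if (rq->in_cgroup)
		rq->runnable_load_avg = (rq->runnable_load_avg + target) / 2;
}

/* Pick the CPU with the highest runnable_load_avg (first one wins ties). */
static int toy_find_busiest(const struct toy_rq *rqs, int nr)
{
	int busiest = 0;

	for (int i = 1; i < nr; i++)
		if (rqs[i].runnable_load_avg > rqs[busiest].runnable_load_avg)
			busiest = i;
	return busiest;
}
```

In this model, if two threads get queued on an empty CPU 1 while CPU 0
keeps running a single thread, the flat case picks CPU 1 as busiest
right away, while the cgroup case keeps picking CPU 0 until the
average has caught up.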
The issue isn't about whether runnable_load_avg or load_avg should be
used but the unexpected differences in the metrics that the load
balancer uses depending on whether cgroup is used or not.
> One last thing, the load_avg of an idle CPU can stay blocked for a
> while (until a load balance happens that will update blocked load) and
> can be seen as "busy" whereas it is not. Could it be a reason for your
> problem ?
AFAICS, the load balancer doesn't use load_avg.
Thanks.
--
tejun