Re: [rfc patch] sched/fair: Use instantaneous load for fork/exec balancing

From: Dietmar Eggemann
Date: Wed Jun 15 2016 - 11:33:06 EST


On 14/06/16 17:40, Mike Galbraith wrote:
> On Tue, 2016-06-14 at 15:14 +0100, Dietmar Eggemann wrote:
>
>> IMHO, the hackbench performance "boost" w/o 0905f04eb21f is due to the
>> fact that a new task gets all its load decayed (making it a small task)
>> in the __update_load_avg() call in remove_entity_load_avg(), because its
>> se->avg.last_update_time value is 0, which creates a huge time difference
>> compared to cfs_rq->avg.last_update_time. The patch 0905f04eb21f
>> avoids this, and thus the task stays big (se->avg.load_avg = 1024).
>
> I don't care much at all about the hackbench "regression" in its own
> right, or what causes it. For me, the bottom line is that there are
> cases we need to be able to resolve, and can't, simply because we're
> looking at a fuzzy (rippling) reflection of the load.

Understood. I just thought it would be nice to know why 0905f04eb21f
makes this problem even more visible. But so far I haven't been able to
figure out why this difference in se->avg.load_avg [1024 versus 0] has
this effect on cfs_rq->runnable_load_avg, making it even less suitable
in find_idlest*(). enqueue_entity_load_avg()'s
cfs_rq->runnable_load_* += sa->load_* looks suspicious though.
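
To make that concrete, here is a minimal user-space sketch of the decay
arithmetic (plain floating point; the kernel uses fixed-point lookup
tables, and the numbers here are only illustrative):

  #include <stdio.h>
  #include <math.h>

  static double decay_load(double val, unsigned long periods)
  {
          /* PELT: y^32 = 0.5, i.e. a load contribution halves every
           * 32 periods (one period is 1024us, ~1ms). */
          return val * pow(0.5, (double)periods / 32.0);
  }

  int main(void)
  {
          /* A new task starts big: se->avg.load_avg = 1024. */
          double load_avg = 1024.0;

          /* With se->avg.last_update_time == 0, the delta against
           * cfs_rq->avg.last_update_time is the full clock value,
           * i.e. an enormous number of periods, so the task decays
           * to ~nothing before it is ever enqueued... */
          printf("w/o 0905f04eb21f:  %.6f\n",
                 decay_load(load_avg, 1000000));

          /* ...whereas with 0905f04eb21f the bogus decay is skipped
           * and enqueue_entity_load_avg() adds the full 1024 to
           * cfs_rq->runnable_load_avg. */
          printf("with 0905f04eb21f: %.6f\n", load_avg);

          return 0;
  }
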
>
> In general, the fuzz helps us to not be so spastic. I'm not sure that
> we really really need to care all that much, because I strongly
> suspect that it's only gonna make any difference at all in corner
> cases, but there are real-world cases that matter. I know for a fact
> that schbench (facebook), which is at least based on a real-world
> load, fails early because we stack tasks due to that fuzzy view of
> reality. In that case, it's because the fuzz consists of a
> high-amplitude aging sawtooth..

... only for fork/exec? That would then be related to the initial value
of se->avg.load_avg. Otherwise we could go back to pre-b92486cbf2aa
("sched: Compute runnable load avg in cpu_load and
cpu_avg_load_per_task").

> find_idlest*() sees a collection of pseudo-random numbers;
> effectively, the fates pick the idlest via lottery, and get it wrong
> often enough that a big box _never_ reaches full utilization before
> we stack tasks, putting an end to the latency game. For generic
> loads, the smoothing works, but for some corners, it blows chunks.
> Fork/exec seemed like a spot where you really can't go wrong by
> looking at clear, unadulterated reality.
>
> -Mike
>