Re: [PATCH] sched/fair: Sync task util before slow-path wakeup
From: Morten Rasmussen
Date: Mon Aug 07 2017 - 08:52:16 EST
On Wed, Aug 02, 2017 at 03:24:05PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 02, 2017 at 02:10:02PM +0100, Brendan Jackman wrote:
> > We use task_util in find_idlest_group via capacity_spare_wake. This
> > task_util is updated in wake_cap. However wake_cap is not the only
> > reason for ending up in find_idlest_group - we could have been sent
> > there by wake_wide. So explicitly sync the task util with prev_cpu
> > when we are about to head to find_idlest_group.
> > We could simply do this at the beginning of
> > select_task_rq_fair (i.e. irrespective of whether we're heading to
> > select_idle_sibling or find_idlest_group & co), but I didn't want to
> > slow down the select_idle_sibling path more than necessary.
> > Don't do this during fork balancing, we won't need the task_util and
> > we'd just clobber the last_update_time, which is supposed to be 0.
> So I remember Morten explicitly not aging util of tasks on wakeup
> because the old util was higher and better representative of what the
> new util would be, or something along those lines.
That was the intention, but when we discussed the wake_cap() stuff we
decided to drop that, hoping that decay clamping or some other magic
would be added on top later. So this patch is in line with current
Using non-aged util is causing trouble when comparing prev_cpu to other
cpus. In cpu_util_wake() we compensate for the fact that the aged task
util is already included in the cpu util on the prev_cpu. For that to
work, we need to age the task util so we know how much is already
accounted for. In the original wake_cap() series I think I had a patch
that stored the non-aged version so we could calculate the potential cpu
utilization:
cpu_util(prev_cpu) - task_util_aged(task) + task_util_nonaged(task)
cpu_util(other_cpu) + task_util_nonaged(task)
This would be better than always under-estimating the task util by using the
aged util as we currently do:
cpu_util(prev_cpu) - task_util_aged(task) + task_util_aged(task)
cpu_util(other_cpu) + task_util_aged(task)
but at least it gives us a fair comparison between prev_cpu and other
cpus.
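To make the two comparisons above concrete, here is a minimal sketch in plain C. It is not kernel code: struct wake_estimate, sub_floor0(), estimate_nonaged() and estimate_aged() are hypothetical names made up for illustration, and the utilization values are passed in as plain numbers rather than read from PELT signals.

```c
/*
 * Illustrative only: hypothetical helpers, not the kernel's
 * cpu_util_wake() or task_util() implementations.
 */
struct wake_estimate {
	unsigned long prev;	/* estimated util if the task wakes on prev_cpu */
	unsigned long other;	/* estimated util if the task wakes on other_cpu */
};

/* Subtract without wrapping below zero. */
static unsigned long sub_floor0(unsigned long a, unsigned long b)
{
	return a > b ? a - b : 0;
}

/*
 * Non-aged variant: remove the aged contribution already accounted
 * for on prev_cpu, then add the non-aged task util on both sides.
 */
static struct wake_estimate estimate_nonaged(unsigned long cpu_util_prev,
					     unsigned long cpu_util_other,
					     unsigned long task_util_aged,
					     unsigned long task_util_nonaged)
{
	struct wake_estimate e;

	e.prev = sub_floor0(cpu_util_prev, task_util_aged) + task_util_nonaged;
	e.other = cpu_util_other + task_util_nonaged;
	return e;
}

/*
 * Aged variant (what the thread describes as current behaviour):
 * the same aged value is removed and added, so prev_cpu is unchanged.
 */
static struct wake_estimate estimate_aged(unsigned long cpu_util_prev,
					  unsigned long cpu_util_other,
					  unsigned long task_util_aged)
{
	struct wake_estimate e;

	e.prev = sub_floor0(cpu_util_prev, task_util_aged) + task_util_aged;
	e.other = cpu_util_other + task_util_aged;
	return e;
}
```

With made-up inputs cpu_util(prev_cpu) = 300, cpu_util(other_cpu) = 250, an aged task util of 100 and a non-aged task util of 200, the non-aged variant gives 400 vs 450 and the aged variant gives 300 vs 350. The gap between the two cpus is the same in both cases, but the aged variant under-estimates each side by 100.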
The Android kernel carries additional patches that track the max (peak)
utilization and uses that as the non aged util for wake-up placement.
I'm hoping we can discuss this topic again at LPC, as last year's idea of
clamping decay didn't work very well to solve this issue.