Re: [PATCH 06/16] sched: Disable WAKE_AFFINE for asymmetric configurations
From: Morten Rasmussen
Date: Tue May 24 2016 - 11:01:30 EST
On Tue, May 24, 2016 at 03:52:00PM +0200, Vincent Guittot wrote:
> On 24 May 2016 at 15:36, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote:
> >> On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote:
> >> >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> >> >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote:
> >> >> >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> >> >> >> > If the system has cpu of different compute capacities (e.g. big.LITTLE)
> >> >> >> > let affine wakeups be constrained to cpus of the same type.
> >> >> >>
> >> >> >> Can you explain why you don't want wake affine with cpus with
> >> >> >> different compute capacity ?
> >> >> >
> >> >> > I should have made the overall idea a bit more clear. The idea is to
> >> >> > deal with cross-capacity migrations in the find_idlest_{group, cpu}()
> >> >> > path so we don't have to touch select_idle_sibling().
> >> >> > select_idle_sibling() is critical for wake-up latency, and I'm assumed
> >> >> > that people wouldn't like adding extra overhead in there to deal with
> >> >> > capacity and utilization.
> >> >>
> >> >> So this means that we will never use the quick path of
> >> >> select_idle_sibling for cross capacity migration but always the one
> >> >> with extra overhead?
> >> >
> >> > Yes. select_idle_sibling() is only used to choose among equal capacity
> >> > cpus (capacity_orig).
> >> >
> >> >> Patch 9 adds more tests for enabling wake_affine path. Can't it also
> >> >> be used for cross capacity migration ? so we can use wake_affine if
> >> >> the task or the cpus (even with different capacity) doesn't need this
> >> >> extra overhead
> >> >
> >> > The test in patch 9 is to determine whether we are happy with the
> >> > capacity of the previous cpu, or we should go look for one with more
> >> > capacity. I don't see how we can use select_idle_sibling() unmodified
> >> > for sched domains containing cpus of different capacity to select an
> >> > appropriate cpu. It is just picking an idle cpu, it might have high
> >> > capacity or low, it wouldn't care.
> >> >
> >> > How would you avoid the overhead of checking capacity and utilization of
> >> > the cpus and still pick an appropriate cpu?
> >>
> >> My point is that there is some wake up case where we don't care about
> >> the capacity and utilization of cpus even for cross capacity migration
> >> and we will never take benefit of this fast path.
> >> You have added an extra check for setting want_affine in patch 9 which
> >> uses capacity and utilization of cpu to disable this fast path when a
> >> task needs more capacity than available. Can't you use this function
> >> to disable the want_affine for cross-capacity migration situation that
> >> cares of the capacity and need the full scan of sched_domain but keep
> >> it enable for other cases ?
> >
> > It is not clear to me what the other cases are. What kind of cases do
> > you have in mind?
>
> As an example, you have a task A that have to be on a big CPU because
> of the requirement of compute capacity, that wakes up a task B that
> can run on any cpu according to its utilization. The fast wake up path
> is fine for task B whatever prev cpu is.

In that case, we will always take the fast path (select_idle_sibling())
for task B if wake_wide() allows it, which should be fine.

wake_cap() will return true as B's prev_cpu is either a big cpu
(first criterion) or has sufficient capacity for B (second criterion).
Given that wake_wide() allows it as well and there are no
restrictions, want_affine will be true. Depending on where wake_affine()
sends us, we will use select_idle_sibling() to search around B's
prev_cpu or this cpu (where task A is running).
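
To make that decision chain concrete, here is a rough user-space sketch of
it. It is not the code from the patch set: struct wakeup, its fields and
both helpers are hypothetical names made up for illustration, and the
capacity test from patch 9 is reduced to a single flag (prev_cap_ok) that
is set when B's prev_cpu passes either of the two criteria above, so it
says nothing about the return convention of the real wake_cap().

```c
#include <stdbool.h>

/*
 * Hypothetical model of one wakeup. In the kernel these values would
 * come from wake_wide(), the capacity test added in patch 9, the task's
 * allowed-cpu mask and wake_affine(); here they are plain inputs.
 */
struct wakeup {
	bool wide;            /* wake_wide() vetoes an affine wakeup */
	bool prev_cap_ok;     /* prev_cpu is big, or has enough capacity */
	bool cpu_allowed;     /* no restriction excludes the waking cpu */
	bool affine_to_waker; /* stand-in for wake_affine()'s decision */
	int this_cpu;         /* cpu where the waker (task A) runs */
	int prev_cpu;         /* cpu where the wakee (task B) last ran */
};

/* All three conditions must hold for the fast path to be considered. */
static bool want_affine(const struct wakeup *w)
{
	return !w->wide && w->prev_cap_ok && w->cpu_allowed;
}

/*
 * Returns the cpu that the select_idle_sibling() search would be
 * centred on, or -1 when the wakeup falls back to the slow path
 * (find_idlest_{group, cpu}()).
 */
static int fast_path_target(const struct wakeup *w)
{
	if (!want_affine(w))
		return -1;
	return w->affine_to_waker ? w->this_cpu : w->prev_cpu;
}
```

For task B in the example, all three conditions hold, so the only open
question is which of the two cpus the search is centred on.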

We avoid the overhead of looking at cpu capacity and utilization, but
we have restricted the search space for select_idle_sibling(). In case
B's prev_cpu is a little cpu, the choice of whether we look for little or
big capacity cpus depends on wake_affine()'s decision. So the search
space isn't as wide as it could be.

To expand the search space we would have to be able to adjust the
sched_domain level at which select_idle_sibling() is operating, so we
can look at same-capacity cpus only in the fast path for tasks like A,
and look at all cpus for tasks like B. It could possibly be done, if we
dare touch select_idle_sibling() ;-) I still have to look at those
patches PeterZ posted a while back.

TL;DR: The fast path should already be used for task B, but the cpu
search space is restricted to a specific subset of cpus selected by
wake_affine(), which isn't ideal but is much less invasive in terms of
code changes.