Re: [PATCH 06/16] sched: Disable WAKE_AFFINE for asymmetric configurations

From: Vincent Guittot
Date: Tue May 24 2016 - 11:53:56 EST


On 24 May 2016 at 17:02, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> On Tue, May 24, 2016 at 03:52:00PM +0200, Vincent Guittot wrote:
>> On 24 May 2016 at 15:36, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
>> > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote:
>> >> On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
>> >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote:
>> >> >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
>> >> >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote:
>> >> >> >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
>> >> >> >> > If the system has cpus of different compute capacities (e.g. big.LITTLE)
>> >> >> >> > let affine wakeups be constrained to cpus of the same type.
>> >> >> >>
>> >> >> >> Can you explain why you don't want wake affine with cpus with
>> >> >> >> different compute capacity ?
>> >> >> >
>> >> >> > I should have made the overall idea a bit more clear. The idea is to
>> >> >> > deal with cross-capacity migrations in the find_idlest_{group, cpu}()
>> >> >> > path so we don't have to touch select_idle_sibling().
>> >> >> > select_idle_sibling() is critical for wake-up latency, and I assumed
>> >> >> > that people wouldn't like adding extra overhead in there to deal with
>> >> >> > capacity and utilization.
>> >> >>
>> >> >> So this means that we will never use the quick path of
>> >> >> select_idle_sibling for cross capacity migration but always the one
>> >> >> with extra overhead?
>> >> >
>> >> > Yes. select_idle_sibling() is only used to choose among equal capacity
>> >> > cpus (capacity_orig).
>> >> >
>> >> >> Patch 9 adds more tests for enabling the wake_affine path. Can't it also
>> >> >> be used for cross-capacity migration, so that we can use wake_affine if
>> >> >> the task or the cpus (even with different capacity) don't need this
>> >> >> extra overhead?
>> >> >
>> >> > The test in patch 9 is to determine whether we are happy with the
>> >> > capacity of the previous cpu, or we should go look for one with more
>> >> > capacity. I don't see how we can use select_idle_sibling() unmodified
>> >> > for sched domains containing cpus of different capacity to select an
>> >> > appropriate cpu. It just picks an idle cpu; that cpu might have high
>> >> > capacity or low, it wouldn't care.
>> >> >
>> >> > How would you avoid the overhead of checking capacity and utilization of
>> >> > the cpus and still pick an appropriate cpu?
>> >>
>> >> My point is that there are some wake-up cases where we don't care about
>> >> the capacity and utilization of cpus, even for cross-capacity migration,
>> >> and these will never benefit from this fast path.
>> >> You have added an extra check for setting want_affine in patch 9 which
>> >> uses capacity and utilization of cpu to disable this fast path when a
>> >> task needs more capacity than available. Can't you use this function
>> >> to disable want_affine for the cross-capacity migration situations that
>> >> care about capacity and need the full scan of the sched_domain, but keep
>> >> it enabled for other cases?
>> >
>> > It is not clear to me what the other cases are. What kind of cases do
>> > you have in mind?
>>
>> As an example, you have a task A that has to be on a big CPU because
>> of its compute capacity requirement, and that wakes up a task B that
>> can run on any cpu according to its utilization. The fast wake-up path
>> is fine for task B whatever its prev cpu is.
>
> In that case, we will always take the fast path (select_idle_sibling())
> for task B if wake_wide() allows it, which should be fine.

Even if want_affine is set, the wake-up of task B will not use the fast path.
The affine_sd will not be set because the sched_domain that contains
both cpus will not have the SD_WAKE_AFFINE flag according to this
patch, will it?
So task B can't use the fast path even though nothing prevents it from
benefiting from it.
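
To make the concern concrete, here is a minimal user-space sketch of
the domain walk I have in mind. It is not the kernel code: struct
toy_domain, find_affine_sd() and the flag value are hypothetical
stand-ins, and I am assuming affine_sd is the first level that is
flagged SD_WAKE_AFFINE and spans prev_cpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SD_WAKE_AFFINE 0x1u	/* toy flag value, not the kernel's */

/* Toy stand-in for one sched_domain level: flags plus a cpu bitmask. */
struct toy_domain {
	unsigned int flags;
	unsigned long span;	/* bit n set => cpu n is in this level */
};

/*
 * Walk the levels bottom-up, as seen from the waking cpu, and return
 * the first level that is flagged SD_WAKE_AFFINE and spans prev_cpu.
 * Returns NULL when want_affine is false or no level qualifies.
 */
const struct toy_domain *
find_affine_sd(const struct toy_domain *levels, size_t nr_levels,
	       int prev_cpu, bool want_affine)
{
	/* prev_cpu must be representable in the toy bitmask. */
	assert(prev_cpu >= 0 && prev_cpu < (int)(8 * sizeof(unsigned long)));

	if (!want_affine)
		return NULL;

	for (size_t i = 0; i < nr_levels; i++) {
		if ((levels[i].flags & SD_WAKE_AFFINE) &&
		    (levels[i].span & (1UL << prev_cpu)))
			return &levels[i];
	}
	return NULL;
}
```

Take cpus 0-1 as big and cpus 2-3 as little, with task A waking B from
cpu 0. If only the level spanning {0,1} carries the flag and the level
spanning {0,1,2,3} does not, then prev_cpu = 2 gives NULL even with
want_affine set, which is the case I describe for task B.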

Am I missing something ?

>
> wake_cap() will return true as B's prev_cpu is either a big cpu
> (first criterion) or has sufficient capacity for B (second criterion).
> Given that wake_wide() returns false as well and there are no
> restrictions, want_affine will be true. Depending on where wake_affine()
> sends us, we will use select_idle_sibling() to search around B's
> prev_cpu or this cpu (where task A is running).
>
> We avoid the overhead of looking for cpu capacity and utilization, but
> we have restricted the search space for select_idle_sibling(). In case
> B's prev_cpu is a little cpu, the choice whether we look for little or
> big capacity cpus depends on the wake_affine()'s decision. So the search
> space isn't as wide as it could be.
>
> To expand the search space we would have to be able to adjust the
> sched_domain level at which select_idle_sibling() is operating, so we
> can look at same-capacity cpus only in the fast path for tasks like A,
> and look at all cpus for tasks like B. It could possibly be done, if we
> dare to touch select_idle_sibling() ;-) I still have to look at those
> patches PeterZ posted a while back.
>
> TL;DR: The fast path should already be used for task B, but the cpu
> search space is restricted to a specific subset of cpus selected by
> wake_affine() which isn't ideal, but much less invasive in terms of code
> changes.