Re: [RFC PATCH 1/2] sched: Clean up SD_BALANCE_WAKE flags in sched domain build-up

From: Vincent Guittot
Date: Wed Jun 01 2016 - 05:25:16 EST

On 1 June 2016 at 03:03, Yuyang Du <yuyang.du@xxxxxxxxx> wrote:
> On Wed, Jun 01, 2016 at 10:32:53AM +0200, Vincent Guittot wrote:
>> > Yup. Up to this point, we don't have any disagreement. And I don't think we
>> > have any disagreement conceptually. What the next patch really does is:
>> >
>> > (1) we don't remove SD_BALANCE_WAKE as an important sched_domain flag, on
>> > the contrary, we strengthen it.
>> >
>> > (2) the semantic of SD_BALANCE_WAKE is currently represented by SD_WAKE_AFFINE,
>> > we actually remove this representation.
>> >
>> > (3) regarding the semantic of SD_WAKE_AFFINE, it is really not about selecting
>> > waker CPU or about the fast path. Conceptually, it is just saying the waker
>> > CPU is a valid and important candidate if SD_BALANCE_WAKE, which is just so
>> > obvious, so I don't think it deserves to be a separate sched_domain flag.
>> >
>> > (4) the outcome is, if SD_BALANCE_WAKE, we definitely will/should try waker CPU,
>> > and if !SD_BALANCE_WAKE, we don't try waker CPU. So nothing functional is
>> > changed.
>> AFAIU, there are 4 possible cases during wake up:
>> - we don't want any balance at wake so we don't have SD_BALANCE_WAKE
>> nor SD_WAKE_AFFINE in sched_domain->flags
>> - we only want wake affine balance check so we only have
>> SD_WAKE_AFFINE in sched_domain->flags
>> - we want wake_affine and full load balance at wake so we have both
>> SD_BALANCE_WAKE and SD_WAKE_AFFINE in sched_domain->flags
>> - we want full load balance but want to skip wake affine fast path so
>> we only have SD_BALANCE_WAKE in sched_domain->flags
>> I'm not sure that we can still do only wake_affine or only full
>> load_balance with your changes, whereas these sequences are valid ones.
> So with the patch, we will have a small semantic change: SD_BALANCE_WAKE
> implies SD_WAKE_AFFINE if allowed, and will favor the "fast path" if possible. I don't
> think we should do anything otherwise.

Why should we not do anything else?

The current default configuration is to only use the wake_affine path.
With your changes, the default configuration will try wake_affine
first and will fall back to the long load balance sequence if
wake_affine doesn't find a sched_domain.

That's a major change in behavior.
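
To make the difference concrete, here is a minimal user-space sketch of
the two flag semantics as I understand them from this thread. It is not
kernel code: the bit values are placeholders, and the names wake_path,
current_semantics(), proposed_semantics(), CURRENT_DEFAULT_FLAGS and
PATCHED_DEFAULT_FLAGS are hypothetical, made up for illustration. It also
assumes that the patch ends up setting SD_BALANCE_WAKE where the default
configuration sets SD_WAKE_AFFINE today.

```c
#include <assert.h>

/* Placeholder bit values for illustration only, not taken from the kernel. */
#define SD_BALANCE_WAKE 0x1u
#define SD_WAKE_AFFINE  0x2u

/* Hypothetical names for the wake-up sequences discussed in the thread. */
enum wake_path {
	WAKE_NONE,             /* no balance at wake up */
	WAKE_AFFINE_ONLY,      /* wake_affine check only */
	WAKE_FULL_ONLY,        /* full load balance, wake_affine skipped */
	WAKE_AFFINE_THEN_FULL  /* wake_affine first, then full load balance */
};

/* Today: the two flags are independent, which gives four distinct cases. */
enum wake_path current_semantics(unsigned int flags)
{
	int affine = !!(flags & SD_WAKE_AFFINE);
	int full = !!(flags & SD_BALANCE_WAKE);

	if (affine && full)
		return WAKE_AFFINE_THEN_FULL;
	if (affine)
		return WAKE_AFFINE_ONLY;
	if (full)
		return WAKE_FULL_ONLY;
	return WAKE_NONE;
}

/*
 * Proposed (as described in the thread): SD_WAKE_AFFINE is no longer a
 * separate flag and SD_BALANCE_WAKE implies the wake_affine attempt.
 * Only two of the four cases remain reachable.
 */
enum wake_path proposed_semantics(unsigned int flags)
{
	if (flags & SD_BALANCE_WAKE)
		return WAKE_AFFINE_THEN_FULL;
	return WAKE_NONE;
}

/* Current default: wake_affine path only. */
#define CURRENT_DEFAULT_FLAGS SD_WAKE_AFFINE
/* Assumption: the patched default carries SD_BALANCE_WAKE instead. */
#define PATCHED_DEFAULT_FLAGS SD_BALANCE_WAKE

/* Spells out the default-configuration change under that assumption. */
void check_default_change(void)
{
	assert(current_semantics(CURRENT_DEFAULT_FLAGS) == WAKE_AFFINE_ONLY);
	assert(proposed_semantics(PATCHED_DEFAULT_FLAGS) == WAKE_AFFINE_THEN_FULL);
}
```

In this model the "only wake_affine" and "only full load balance" cases
cannot be expressed any more, and the default moves from the first of
those to the combined sequence.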

> So I think this is a combined case better than either of the "only wake_affine"
> or "only full" cases. Make sense?