Re: [RFC v5 4/6] sched/fair: Tune task wake-up logic to pack small background tasks on fewer cores

From: Parth Shah
Date: Wed Oct 09 2019 - 04:57:42 EST




On 10/8/19 10:22 PM, Dietmar Eggemann wrote:
> [- Quentin Perret <quentin.perret@xxxxxxx>]
> [+ Quentin Perret <qperret@xxxxxxxxxxx>]
>
> See commit c193a3ffc282 ("mailmap: Update email address for Quentin Perret")
>

Noted. Thanks for notifying me.

> On 07/10/2019 18:53, Parth Shah wrote:
>>
>>
>> On 10/7/19 5:49 PM, Vincent Guittot wrote:
>>> On Mon, 7 Oct 2019 at 10:31, Parth Shah <parth@xxxxxxxxxxxxx> wrote:
>>>>
>>>> The algorithm finds the first non-idle core in the system and tries to
>>>> place the task on an idle CPU in the chosen core. To maintain
>>>> cache hotness, the search for a non-idle core starts from prev_cpu,
>>>> which also reduces task ping-pong behaviour inside the core.
>>>>
>>>> Define a new method, select_non_idle_core, which keeps track of the idle
>>>> and non-idle CPUs in the core and, based on heuristics, determines whether
>>>> the core is sufficiently busy to place the incoming background task. The
>>>> heuristic further classifies each non-idle CPU as either a busy (>12.5%
>>>> util) CPU or an overutilized (>80% util) CPU.
>>>> - A core containing more idle CPUs and no busy CPUs is not selected for
>>>> packing
>>>> - A core containing more than one overutilized CPU is exempted from
>>>> task packing
>>>> - Pack if there is at least one busy CPU and the overutilized CPU count is <2
>>>>
>>>> Value of 12.5% utilization for busy CPU gives sufficient heuristics for CPU
>>>> doing enough work an
>
> [...]
>
>>>> @@ -6483,7 +6572,11 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
>>>> } else if (sd_flag & SD_BALANCE_WAKE) { /* XXX always ? */
>>>> /* Fast path */
>>>>
>>>> - new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
>>>> + if (is_turbosched_enabled() && unlikely(is_background_task(p)))
>>>> + new_cpu = turbosched_select_non_idle_core(p, prev_cpu,
>>>> + new_cpu);
>>>
>>> Could you add turbosched_select_non_idle_core() similarly to
>>> find_energy_efficient_cpu()?
>>> Add it at the beginning of select_task_rq_fair().
>>> Return immediately with the CPU if you have found one,
>>> or let the normal path select a CPU if
>>> turbosched_select_non_idle_core() has not been able to find a suitable
>>> CPU for packing.
>>>
>>
>> Of course, I can do that.
>> I was not aware of the effect of wake_affine and so was waiting for
>> such comments to be sure. Thanks for this.
>> Maybe I can add it just below the sched_energy_present(){...} construct, giving
>> precedence to EAS? I'm asking because I remember Patrick telling me to
>> leverage task packing for Android as well.
>
> I have a hard time imagining that Turbosched will be used in Android next
> to EAS in the foreseeable future.
>
> First of all, EAS already provides task packing at the Performance Domain
> (PD) level (a.k.a. cluster on traditional 2-cluster Arm/Arm64
> big.LITTLE or DynamIQ (with Phantom domains (out-of-tree solution))).
> This is where we can save energy without harming latency.
>
> See the test results under '2.1 Energy test case' in
>
> https://lore.kernel.org/r/20181203095628.11858-1-quentin.perret@xxxxxxx
>
> There are 10 to 50 small (classified solely by task utilization) tasks
> per test case and EAS shows an effect on energy consumption by packing
> them onto the PD (cluster) of the small CPUs.
>
> And second, the supported CPU topology is different from the one you're
> testing on.
>

Cool. I was just keeping in mind the following quote:
"defining a generic spread-vs-pack wakeup policy which is something
Android also could benefit from" (https://lkml.org/lkml/2019/6/28/628)

BTW, IIUC that does task consolidation only on a single CPU unless
rd->overload is set, right?
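
For reference, the per-core packing heuristic from the patch description
quoted above can be sketched in plain userspace C roughly as follows. This
is only an illustration and not the patch code: pick_idle_cpu_in_core() and
its arguments are hypothetical names, utilization is assumed to be scaled
to 0..1024, and the busy and overutilized classes are assumed to be
mutually exclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: utilization is expressed on a 0..1024 scale. */
#define UTIL_SCALE        1024u
#define BUSY_UTIL         (UTIL_SCALE / 8)          /* 12.5% -> 128 */
#define OVERUTILIZED_UTIL ((UTIL_SCALE * 80) / 100) /* 80%   -> 819 */

/*
 * Hypothetical helper: inspect the CPUs of one core and return the index
 * of an idle CPU to pack onto, or -1 if the core should not be used.
 *
 *   util[i] - utilization of CPU i in the core (0..1024)
 *   idle[i] - true if CPU i is currently idle
 */
int pick_idle_cpu_in_core(const unsigned int *util, const bool *idle,
                          size_t nr_cpus)
{
    size_t busy = 0, overutilized = 0;
    int idle_cpu = -1;

    assert(util != NULL && idle != NULL);

    for (size_t i = 0; i < nr_cpus; i++) {
        if (idle[i]) {
            if (idle_cpu < 0)
                idle_cpu = (int)i;      /* remember the first idle CPU */
        } else if (util[i] > OVERUTILIZED_UTIL) {
            overutilized++;
        } else if (util[i] > BUSY_UTIL) {
            busy++;
        }
    }

    /* Pack only if at least one CPU is busy and fewer than two are
     * overutilized; a core without an idle CPU yields -1 as well. */
    if (busy >= 1 && overutilized < 2)
        return idle_cpu;

    return -1;
}
```

A caller would try this on each core starting from the one containing
prev_cpu and fall back to the normal wake-up path when every core
returns -1.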

> [...]
>