Re: [RFC v5 4/6] sched/fair: Tune task wake-up logic to pack small background tasks on fewer cores
From: Parth Shah
Date: Wed Oct 09 2019 - 13:02:53 EST
On 10/9/19 7:56 PM, Dietmar Eggemann wrote:
> On 09/10/2019 10:57, Parth Shah wrote:
>
> [...]
>
>>> On 07/10/2019 18:53, Parth Shah wrote:
>>>>
>>>>
>>>> On 10/7/19 5:49 PM, Vincent Guittot wrote:
>>>>> On Mon, 7 Oct 2019 at 10:31, Parth Shah <parth@xxxxxxxxxxxxx> wrote:
>
> [...]
>
>>>> Maybe I can add just below the sched_energy_present(){...} construct giving
>>>> precedence to EAS? I'm asking this because I remember Patrick telling me to
>>>> leverage task packing for android as well?
>>>
>>> I have a hard time imagining that Turbosched will be used in Android next
>>> to EAS in the foreseeable future.
>>>
>>> First of all, EAS already provides task packing at the Performance Domain
>>> (PD) level (a.k.a. cluster on traditional 2-cluster Arm/Arm64
>>> big.LITTLE, or DynamIQ with Phantom domains (out-of-tree solution)).
>>> This is where we can save energy without harming latency.
>>>
>>> See the test results under '2.1 Energy test case' in
>>>
>>> https://lore.kernel.org/r/20181203095628.11858-1-quentin.perret@xxxxxxx
>>>
>>> There are 10 to 50 small (classified solely by task utilization) tasks
>>> per test case and EAS shows an effect on energy consumption by packing
>>> them onto the PD (cluster) of the small CPUs.
>>>
>>> And second, the supported CPU topology is different from the one you're
>>> testing on.
>>>
>>
>> cool. I was just keeping in mind the following quote
>> " defining a generic spread-vs-pack wakeup policy which is something
>> Android also could benefit from " (https://lkml.org/lkml/2019/6/28/628)
>
> The main thing is that if we want to introduce new functionality
> into CFS, we should try hard to use existing infrastructure (or
> infrastructure that we already agree we'll need) as much as
> possible.
>
> If I understand Patrick here correctly, he suggested not to use uclamp
> but the task latency nice approach. There is agreement that we would
> need something like this as infrastructure:
>
> https://lore.kernel.org/r/20190830174944.21741-1-subhra.mazumdar@xxxxxxxxxx
>
got it.
> So p->latency_nice should be able to accommodate your p->flags |=
> PF_CAN_BE_PACKED concept nicely.
yeah, I'm working on that part too as a bigger goal.
>
>>
>> BTW, IIUC that does task consolidation only on a single CPU unless
>> rd->overload is set, right?
>
> Task consolidation on Performance Domains (PDs) w/ multiple CPUs (e.g.
> on a per-cluster PD big.LITTLE system) only works when the system is not
> overutilized:
>
> int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
> {
> ...
>         if (!pd || *READ_ONCE(rd->overutilized)*)
>                 goto fail;
> ...
OK. So does that mean TurboSched can still do some good on such systems as
well?
I mean, it could save energy even when rd->overutilized == 1 by not waking
user-classified background tasks on an idle core.
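
To make the question concrete, here is a rough user-space sketch of the idea (not kernel code). It assumes a toy model in which the energy-aware path is skipped once the system is overutilized, and a packable background task is then steered away from fully idle cores. The names toy_cpu, toy_core_is_idle and toy_select_cpu are hypothetical and exist only for illustration, as does the "most spare capacity" tie-break.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and names are hypothetical, not kernel API. */
struct toy_cpu {
	int core_id;       /* CPUs with the same core_id are SMT siblings */
	unsigned int util; /* current utilization */
	unsigned int cap;  /* capacity */
};

/* A core counts as idle only if every CPU on it has zero utilization. */
static bool toy_core_is_idle(const struct toy_cpu *cpus, int nr_cpus, int core_id)
{
	for (int i = 0; i < nr_cpus; i++) {
		if (cpus[i].core_id == core_id && cpus[i].util != 0)
			return false;
	}
	return true;
}

/*
 * Pick a CPU for a waking task with utilization task_util.
 * Returns a CPU index, or -1 meaning "fall back to the regular wake-up path".
 */
static int toy_select_cpu(const struct toy_cpu *cpus, int nr_cpus,
			  unsigned int task_util, bool can_be_packed,
			  bool overutilized)
{
	int best = -1;
	unsigned int best_spare = 0;

	assert(cpus && nr_cpus > 0);

	/*
	 * Assumption: when not overutilized, an EAS-like path already places
	 * the task, so packing is only tried for packable tasks afterwards.
	 */
	if (!overutilized || !can_be_packed)
		return -1;

	for (int i = 0; i < nr_cpus; i++) {
		unsigned int spare;

		/* Do not wake up a fully idle core for a background task. */
		if (toy_core_is_idle(cpus, nr_cpus, cpus[i].core_id))
			continue;

		spare = cpus[i].cap > cpus[i].util ? cpus[i].cap - cpus[i].util : 0;
		if (spare < task_util)
			continue;

		/* Among CPUs on busy cores, prefer the most spare capacity. */
		if (best < 0 || spare > best_spare) {
			best = i;
			best_spare = spare;
		}
	}
	return best;
}
```

For example, with two 2-thread cores where only CPU 0 (core 0) is busy at 300/1024, a packable task of utilization 100 in an overutilized system lands on CPU 1, the idle sibling of the busy core, and core 1 stays idle. If no busy core has room, the function returns -1 and the task takes the regular path.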
>
> [...]
>
Thanks,
Parth