Re: [PATCH v2 6/6] sched/deadline: Implement fallback mechanism for !fit case
From: Dietmar Eggemann
Date: Wed Apr 29 2020 - 13:39:58 EST
On 27/04/2020 16:17, luca abeni wrote:
> Hi Juri,
>
> On Mon, 27 Apr 2020 15:34:38 +0200
> Juri Lelli <juri.lelli@xxxxxxxxxx> wrote:
>
>> Hi,
>>
>> On 27/04/20 10:37, Dietmar Eggemann wrote:
>>> From: Luca Abeni <luca.abeni@xxxxxxxxxxxxxxx>
>>>
>>> When a task has a runtime that cannot be served within the
>>> scheduling deadline by any of the idle CPU (later_mask) the task is
>>> doomed to miss its deadline.
>>>
>>> This can happen since the SCHED_DEADLINE admission control
>>> guarantees only bounded tardiness and not the hard respect of all
>>> deadlines. In this case try to select the idle CPU with the largest
>>> CPU capacity to minimize tardiness.
>>>
>>> Signed-off-by: Luca Abeni <luca.abeni@xxxxxxxxxxxxxxx>
>>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> [...]
>>> - if (!cpumask_empty(later_mask))
>>> - return 1;
>>> + if (cpumask_empty(later_mask))
>>> + cpumask_set_cpu(max_cpu, later_mask);
>>
>> Think we touched upon this during v1 review, but I'm (still?)
>> wondering if we can do a little better, still considering only free
>> cpus.
>>
>> Can't we get into a situation that some of the (once free) big cpus
>> have been occupied by small tasks and now a big task enters the
>> system and it only finds small cpus available, where it could have fit
>> into bigs if small tasks were put onto small cpus?
>>
>> I.e., shouldn't we always try to best fit among free cpus?
>
> Yes; there was an additional patch that tried to schedule each task on the
> slowest core where it can fit, to address this issue.
> But I think it will go in a second round of patches.

Yes, we can run into this situation in DL, but also in CFS or RT.

IMHO, this patch is aligned with the Capacity Awareness implementation
in CFS and RT.

Capacity Awareness so far is 'find a CPU which fits the requirement of
the task (Req)'. It's not (yet) 'find the best CPU'.

CFS - select_idle_capacity() -> task_fits_capacity()

      Req: util(p) * 1.25 < capacity_of(cpu)

RT  - select_task_rq_rt(), cpupri_find_fitness() ->
      rt_task_fits_capacity()

      Req: uclamp_eff_value(p) <= capacity_orig_of(cpu)

DL  - select_task_rq_dl(), cpudl_find() -> dl_task_fits_capacity()

      Req: dl_runtime(p)/dl_deadline(p) * 1024 <= capacity_orig_of(cpu)
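
To make the three requirements concrete, here is a rough userspace
sketch of them in integer arithmetic. It is not the kernel code: the
helper names cfs_fits(), rt_fits() and dl_fits() are made up for
illustration, and the inputs are passed in as plain numbers instead of
being read from a task or a CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Capacity of the biggest CPU, i.e. the 1024 in the DL requirement */
#define CAPACITY_SCALE 1024ULL

/* CFS Req: util(p) * 1.25 < capacity_of(cpu), without floating point */
static bool cfs_fits(uint64_t util, uint64_t capacity)
{
	return util * 1280 < capacity * 1024;
}

/* RT Req: uclamp_eff_value(p) <= capacity_orig_of(cpu) */
static bool rt_fits(uint64_t uclamp_eff, uint64_t capacity_orig)
{
	return uclamp_eff <= capacity_orig;
}

/*
 * DL Req: dl_runtime(p)/dl_deadline(p) * 1024 <= capacity_orig_of(cpu).
 * Cross-multiplied so that the division disappears.
 */
static bool dl_fits(uint64_t runtime_ns, uint64_t deadline_ns,
		    uint64_t capacity_orig)
{
	assert(deadline_ns > 0);
	return runtime_ns * CAPACITY_SCALE <= capacity_orig * deadline_ns;
}
```

With a hypothetical little CPU of capacity 446, a DL task with
runtime 4ms and deadline 10ms fits (409.6 <= 446), whereas one with
runtime 5ms and deadline 10ms (512 > 446) only fits on a bigger CPU.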

There has to be an "idle" (from the viewpoint of the task) CPU available
with a fitting capacity. Otherwise a fallback mechanism applies.

CFS - best capacity handling in select_idle_capacity().

RT  - Non-fitting lowest mask

DL  - This patch
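
The DL fallback can be sketched in userspace roughly as below. This is
only an illustration of the idea (keep the fitting CPUs, and if there
are none, fall back to the candidate with the largest capacity), not
the cpudl_find() code itself. NR_CPUS, dl_pick_cpus() and the plain
bitmask standing in for later_mask are assumptions of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical system size */

/* Same requirement as above: runtime/deadline * 1024 <= capacity */
static bool dl_fits(uint64_t runtime, uint64_t deadline, uint64_t cap)
{
	assert(deadline > 0);
	return runtime * 1024 <= cap * deadline;
}

/*
 * candidates: bitmask of CPUs that are "idle" from the task's viewpoint.
 * cap[]: per-CPU capacity, assumed to be > 0 for every candidate.
 * Returns the bitmask of CPUs the task may be placed on.
 */
static uint32_t dl_pick_cpus(uint32_t candidates, const uint64_t cap[NR_CPUS],
			     uint64_t runtime, uint64_t deadline)
{
	uint32_t fitting = 0;
	uint64_t max_cap = 0;
	int max_cpu = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(candidates & (1u << cpu)))
			continue;

		if (dl_fits(runtime, deadline, cap[cpu])) {
			fitting |= 1u << cpu;
		} else if (cap[cpu] > max_cap) {
			/* Remember the biggest non-fitting candidate */
			max_cap = cap[cpu];
			max_cpu = cpu;
		}
	}

	/* Fallback: nothing fits, so take the largest-capacity candidate */
	if (!fitting && max_cpu >= 0)
		fitting = 1u << max_cpu;

	return fitting;
}
```

If at least one candidate fits, the result contains only fitting CPUs.
The fallback only kicks in when the fitting set would otherwise be
empty, which mirrors the cpumask_empty(later_mask) check in the patch.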

You did spot the rt-app 'delay' for the small tasks in the test case ;-)