[SchedulerWakeupLatency] Skip energy aware task placement

From: Dietmar Eggemann
Date: Thu Jul 23 2020 - 05:31:47 EST


On 23/06/2020 09:29, Patrick Bellasi wrote:

> .:: Scheduler Wakeup Path Requirements Collection Template
> ==========================================================
>
> A) Name: unique one-liner name for the proposed use-case

[SchedulerWakeupLatency] Skip energy aware task placement

> B) Target behaviour: one paragraph to describe the wakeup path issue

Energy aware task placement searches the Performance Domains (PDs) for
the most energy-efficient CPU by consulting the Energy Model (EM), i.e.
it estimates how much energy each PD would consume if the task migrated
to one of its CPUs. This search adds latency to task placement.
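
To illustrate where that latency comes from, here is a minimal user-space
sketch of such a search. It is a toy model under stated assumptions, not
the kernel's find_energy_efficient_cpu(): struct toy_pd, the linear
cost_per_util energy estimate and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, not the kernel's struct em_perf_domain. */
struct toy_pd {
    int first_cpu;               /* first CPU id of this domain */
    int nr_cpus;                 /* number of CPUs in this domain */
    unsigned long cost_per_util; /* assumed energy cost per unit of utilization */
};

/* Estimate the energy of one PD if the task ran on dst_cpu. */
static unsigned long toy_pd_energy(const struct toy_pd *pd,
                                   const unsigned long *cpu_util,
                                   int dst_cpu, unsigned long task_util)
{
    unsigned long sum = 0;

    for (int i = 0; i < pd->nr_cpus; i++) {
        int cpu = pd->first_cpu + i;

        sum += cpu_util[cpu];
        if (cpu == dst_cpu)
            sum += task_util;
    }
    return sum * pd->cost_per_util;
}

/* Pick the candidate CPU with the lowest estimated system-wide energy. */
static int toy_find_energy_efficient_cpu(const struct toy_pd *pds, size_t nr_pds,
                                         const unsigned long *cpu_util,
                                         unsigned long task_util)
{
    int best_cpu = -1;
    unsigned long best_energy = 0;

    assert(nr_pds > 0);

    for (size_t p = 0; p < nr_pds; p++) {
        /* Candidate: the least utilized CPU of this PD. */
        int cand = pds[p].first_cpu;

        for (int i = 1; i < pds[p].nr_cpus; i++) {
            int cpu = pds[p].first_cpu + i;

            if (cpu_util[cpu] < cpu_util[cand])
                cand = cpu;
        }

        /* Re-estimate every PD for this candidate: the costly part. */
        unsigned long energy = 0;

        for (size_t q = 0; q < nr_pds; q++)
            energy += toy_pd_energy(&pds[q], cpu_util, cand, task_util);

        if (best_cpu < 0 || energy < best_energy) {
            best_cpu = cand;
            best_energy = energy;
        }
    }
    return best_cpu;
}
```

Even in this simplified form each wakeup walks all CPUs of all PDs once
per candidate, which is the kind of work a latency sensitive task would
like to avoid.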

For some tasks this extra latency can be too high. Good examples are the
Android display pipeline tasks UIThread and RenderThread. They have to
be placed on idle CPUs by a wakeup mechanism that is faster than the
energy aware wakeup path (A), to minimize the number of dropped or
delayed frames (a.k.a. jank).

Linux kernel mainline currently has no mechanism that lets latency
sensitive tasks skip the energy aware wakeup path (A) and take the fast
path (B) instead.

> C) Existing control paths: reference to code paths to justify B)

select_task_rq_fair()
{
    ...

    if (wakeup)
        if (asym_cpucapacity && EM && schedutil)
            new_cpu = find_energy_efficient_cpu();    <- (A)
            if (new_cpu >= 0)
                return new_cpu;

    ...

    if (unlikely(sd))
        /* slow path */
    else if (sd_flag & wakeup)
        /* fast path */
        new_cpu = select_idle_sibling() {
            if (asym_cpucapacity)
                new_cpu = select_idle_capacity();     <- (B)
                if (new_cpu >= 0)
                    return new_cpu;
        }

    ...

    return new_cpu;
}

> D) Desired behaviour: one paragraph to describe the desired update

A mechanism that lets a task skip the energy aware wakeup path (A) and
fall back to the fast path (B).
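
A minimal user-space sketch of that behaviour, assuming a hypothetical
per-task latency_sensitive flag (no such flag exists in mainline). The
two stand-in functions return fixed CPUs and only mimic the roles of
paths (A) and (B); all toy_* names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state; mainline has no such flag. */
struct toy_task {
    bool latency_sensitive;
};

/* Counts how often the stand-in for path (A) runs. */
static int toy_energy_path_calls;

/* Stand-in for path (A): fixed result, counts its invocations. */
static int toy_energy_aware_cpu(void)
{
    toy_energy_path_calls++;
    return 1;
}

/* Stand-in for path (B): pretend CPU 5 is idle. */
static int toy_idle_cpu(void)
{
    return 5;
}

static int toy_select_cpu(const struct toy_task *p, int prev_cpu)
{
    int cpu;

    assert(p);

    /* Path (A) is not entered at all for latency sensitive tasks. */
    if (!p->latency_sensitive) {
        cpu = toy_energy_aware_cpu();
        if (cpu >= 0)
            return cpu;
    }

    /* Path (B): look for an idle CPU. */
    cpu = toy_idle_cpu();
    if (cpu >= 0)
        return cpu;

    return prev_cpu;
}
```

The point of the sketch is that the flag is checked before the energy
aware search starts, so a latency sensitive task never pays for it.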

> E) Existing knobs (if any): reference to whatever existing tunable

There are no existing ways to control this behaviour in Linux kernel
mainline.

Android has the concept of 'prefer idle', which is tightly coupled to
the Android-specific, out-of-tree cgroup controller schedtune.

> F) Existing knobs (if any): one paragraph description of the limitations

Schedtune will be replaced by mainline uclamp in upcoming Android
releases. There is no per-task 'prefer idle' interface.

> G) Proportionality Analysis: check the nature of the target behavior

The use case is binary: a task either cares about wakeup latency or it
does not.

> H) Range Analysis: identify meaningful ranges

The knob can be defined as latency sensitive (i.e. prefer an idle CPU)
or as not latency sensitive.

Mapping Analysis:

If required by other use-cases, the binary range requirement can easily
be covered by a wider, finer-grained latency sensitivity range.
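
As a rough sketch of such a mapping, assume a nice-like range of
[-20, 19] in which lower values mean more latency sensitive. Both the
range and the threshold of 0 are assumptions made for illustration, not
an existing interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed nice-like range; lower means more latency sensitive. */
#define TOY_LATENCY_MIN (-20)
#define TOY_LATENCY_MAX 19

/* Hypothetical cut-off: values below it request an idle CPU. */
#define TOY_PREFER_IDLE_THRESHOLD 0

/* Collapse the wider range into the binary 'prefer idle' decision. */
static bool toy_prefers_idle(int latency_value)
{
    /* Clamp out-of-range input instead of rejecting it. */
    if (latency_value < TOY_LATENCY_MIN)
        latency_value = TOY_LATENCY_MIN;
    if (latency_value > TOY_LATENCY_MAX)
        latency_value = TOY_LATENCY_MAX;

    return latency_value < TOY_PREFER_IDLE_THRESHOLD;
}
```

With this mapping the binary use-case only occupies one side of the
range, which leaves the remaining values free for other use-cases.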

> I) System-Wide tuning: which knobs are required

No system-wide tuning required.

> J) Per-Task tuning: which knobs are required

The proposal is a per-task flag, indicating whether the task is latency
sensitive or not.

> K) Task-Group tuning: which knobs are required

Currently Android uses the 'prefer idle' mechanism only on task-groups
and not on individual tasks.

Therefore a per task-group implementation would be required. The
implementation should respect the cgroup resource distribution models
[1], [2].
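
One possible reading of this, sketched in user-space C, is a limit-style
propagation in which a group's effective setting is restricted by its
ancestors. This is only an assumption about how a per task-group flag
could behave; struct toy_group and toy_effective_prefer_idle() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task-group node; parent is NULL for the root group. */
struct toy_group {
    const struct toy_group *parent;
    bool prefer_idle; /* value requested for this group */
};

/*
 * Effective value under the assumed limit-style model: a group only
 * gets 'prefer idle' if it and every ancestor request it.
 */
static bool toy_effective_prefer_idle(const struct toy_group *g)
{
    assert(g);

    for (; g; g = g->parent) {
        if (!g->prefer_idle)
            return false;
    }
    return true;
}
```

Under this model a parent group can switch the behaviour off for its
whole subtree, while a child cannot grant itself more than its parent
allows.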

> .:: References
> ==============

[1] LWN: The many faces of "latency nice"
https://lwn.net/Articles/820659

[2] Control Group v2: Resource Distribution Models
https://www.kernel.org/doc/Documentation/admin-guide/cgroup-v2.rst