[PATCH v5 0/3] Fix a couple of corner cases in feec() when using uclamp_max

From: Qais Yousef
Date: Sat Sep 16 2023 - 19:44:02 EST


Thanks for all the reviews so far!

Changes in v5:

* Added Reviewed-by Dietmar Eggemann.
* Updated commit messages in patch 1 and 2 as requested by Dietmar.

Changes in v4:

* Added Reviewed-by Vincent Guittot.
* Updated sched_compute_energy_tp() to include max_util and busy_time
as requested by Lukasz.

Changes in v3:

* Fix sign comparison problem in patch 1 (Thanks Vincent!)
* Simplify comparison and remove function in patch 2 (Thanks Dietmar!)

Changes in v2:

* Use long instead of unsigned long to keep the comparison simple
in spite of being inconsistent with the type used for capacity.
* Fix missing termination parenthesis that caused build error.
* Rebase on latest tip/sched/core and Vincent's v5 of the unlink misfit patch.

v1 link: https://lore.kernel.org/lkml/20230129161444.1674958-1-qyousef@xxxxxxxxxxx/
v2 link: https://lore.kernel.org/lkml/20230205224318.2035646-1-qyousef@xxxxxxxxxxx/
v3 link: https://lore.kernel.org/lkml/20230717215717.309174-1-qyousef@xxxxxxxxxxx/
v4 link: https://lore.kernel.org/lkml/20230821224504.710576-1-qyousef@xxxxxxxxxxx/

In v2, Dietmar raised concerns about a limitation in the current EM
calculations that can end up packing more tasks on a cluster. This is not
ideal and we need to fix it, but it is an independent problem that is not
introduced by this fix. I don't see a reason to couple the two rather than
work on each problem independently. In practice the packing behavior is not
that bad: if tasks are capped really hard, there's a desire to keep them on
the less performant clusters.

Patch 1 addresses a bug where forcing a task onto a small CPU to honour the
uclamp_max hint can leave us with spare_capacity = 0, while the logic is
constructed such that spare_capacity = 0 leads to ignoring this CPU as
a candidate for compute_energy().
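
As a rough illustration of the problem patch 1 targets, here is a minimal
userspace sketch. It is not the code in kernel/sched/fair.c: the function
names and the flat spare_cap[] array are hypothetical. It only models the
idea, taken from the changelog above, that tracking the maximum as a signed
long starting at -1 lets a CPU with zero spare capacity still be picked.

```c
#include <stddef.h>

/*
 * Simplified, hypothetical model of the per-CPU scan. Names and
 * structure are illustrative, not copied from kernel/sched/fair.c.
 */

/* Pre-fix shape: unsigned running maximum starting at 0, strict '>'. */
int pick_max_spare_cap_cpu_old(const unsigned long *spare_cap, size_t nr_cpus)
{
	unsigned long max_spare_cap = 0;
	int max_spare_cap_cpu = -1;
	size_t cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		/* A CPU with spare_cap == 0 can never beat the initial 0. */
		if (spare_cap[cpu] > max_spare_cap) {
			max_spare_cap = spare_cap[cpu];
			max_spare_cap_cpu = (int)cpu;
		}
	}

	return max_spare_cap_cpu;
}

/* Post-fix shape: signed running maximum starting at -1. */
int pick_max_spare_cap_cpu_new(const unsigned long *spare_cap, size_t nr_cpus)
{
	long max_spare_cap = -1;
	int max_spare_cap_cpu = -1;
	size_t cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		/* spare_cap == 0 now beats the initial -1 and is recorded. */
		if ((long)spare_cap[cpu] > max_spare_cap) {
			max_spare_cap = (long)spare_cap[cpu];
			max_spare_cap_cpu = (int)cpu;
		}
	}

	return max_spare_cap_cpu;
}
```

With every CPU at spare_cap = 0, the first variant returns -1 (no candidate)
while the second returns CPU 0, so the energy computation still has a CPU to
consider.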

Patch 2 addresses a bug due to an optimization in feec() that could lead to
ignoring tasks whose uclamp_max = 0 but task_util(p) != 0.
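
The effect described for patch 2 can be sketched in userspace as follows.
This is an assumption-laden model, not the kernel implementation: clamp_util()
and the two feec_skips_task_*() helpers are hypothetical names, and the exact
condition used in the patch may differ. It only shows why a shortcut keyed on
the clamped utilization misfires when uclamp_max = 0.

```c
#include <stdbool.h>

/* Hypothetical helper: clamp util into [util_min, util_max]. */
unsigned long clamp_util(unsigned long util, unsigned long util_min,
			 unsigned long util_max)
{
	unsigned long v = util < util_min ? util_min : util;

	return v > util_max ? util_max : v;
}

/*
 * Pre-fix shape (illustrative): skip the task whenever its clamped
 * utilization is 0. With util_max == 0 this is always true, even for
 * a task that is actually running.
 */
bool feec_skips_task_old(unsigned long util, unsigned long util_min,
			 unsigned long util_max)
{
	return clamp_util(util, util_min, util_max) == 0;
}

/*
 * One possible post-fix shape (assumption): do not apply the
 * (util == 0) shortcut when util_max == 0.
 */
bool feec_skips_task_new(unsigned long util, unsigned long util_min,
			 unsigned long util_max)
{
	if (util_max == 0)
		return false;

	return clamp_util(util, util_min, util_max) == 0;
}
```

For a task with util = 300 capped to uclamp_max = 0, the first check skips it
and the second does not; for an idle task with the default uclamp_max = 1024,
both still take the shortcut.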

Patch 3 adds a new tracepoint in compute_energy() as it was helpful in
debugging these two problems.

This is based on tip/sched/core.

Qais Yousef (3):
  sched/uclamp: Set max_spare_cap_cpu even if max_spare_cap is 0
  sched/uclamp: Ignore (util == 0) optimization in feec() when
    p_util_max = 0
  sched/tp: Add new tracepoint to track compute energy computation

 include/trace/events/sched.h |  5 +++++
 kernel/sched/core.c          |  1 +
 kernel/sched/fair.c          | 36 ++++++++++++------------------------
 3 files changed, 18 insertions(+), 24 deletions(-)

--
2.34.1