[PATCH v8 0/4] sched: Don't trigger misfit if affinity is restricted

From: Qais Yousef
Date: Sat Mar 23 2024 - 20:46:26 EST


There was a discussion about handling a hotplug operation that removes a
capacity level, which leads to unnecessary misfit lb being triggered again. I
opted not to handle it for now, but a working patch is available in [1]. I
don't feel strongly about it and will leave it to the maintainers to decide
which direction they prefer. Patch 4 makes sure that balance_interval and
nr_balance_failed won't grow unnecessarily due to such unnecessary misfit lb.
The result is some sub-optimality, but no incorrect behavior.
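
To illustrate the idea behind patch 4, here is a minimal userspace sketch. It
is not the code from the patch: the struct, the enum and the function name are
hypothetical stand-ins, and the real logic lives in the load balancer in
kernel/sched/fair.c. It only models the intent that a failed misfit balance
leaves the back-off state untouched.

```c
/* Hypothetical model of the reason a load balance attempt was made. */
enum model_migration_type {
	MODEL_MIGRATE_LOAD,
	MODEL_MIGRATE_MISFIT,
};

/* Hypothetical model of the per-domain back-off state. */
struct model_sched_domain {
	unsigned int balance_interval;	/* current interval between balances */
	unsigned int max_interval;	/* upper bound for the interval */
	unsigned int nr_balance_failed;	/* consecutive failed attempts */
};

/*
 * Account for a load balance attempt that moved nothing.
 *
 * A failed misfit balance is skipped entirely, so it neither bumps the
 * failure counter nor doubles the interval. Any other failure backs off
 * as usual.
 */
static void model_account_failed_balance(struct model_sched_domain *sd,
					  enum model_migration_type type)
{
	if (type == MODEL_MIGRATE_MISFIT)
		return;

	sd->nr_balance_failed++;

	/* Double the interval while it is still below the maximum. */
	if (sd->balance_interval < sd->max_interval)
		sd->balance_interval *= 2;
}
```

With this model, repeated failed misfit attempts cannot delay the next regular
load balance, which is the pollution the patch is meant to avoid.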

After the 6.9 merge window, the dynamic Energy Model series will be merged, and
it can lead to the capacities of the CPUs changing at runtime. This means I
need to post a follow-up patch to handle this situation and ensure
max_allowed_capacity is correct after an EM update. That might then make
handling of hotplug operations attractive too, as there would be some common
ground.

[1] https://lore.kernel.org/lkml/20240321122039.7gk2mc3syvkrvhjz@airbuntu/

Changes since v7:

* Remove sd arg from check_misfit_status()
* Fix typo in the commit message of patch 2.
* Add Reviewed-by from Vincent

Changes since v6:

* Simplify update_misfit_status

Changes since v5:

* Remove redundant check to rq->rd->max_cpu_capacity
* Simplify check_misfit_status() further by removing unnecessary checks.
* Add new patch to remove no longer used rd->max_cpu_capacity
* Add new patch to prevent misfit lb from polluting balance_interval
and nr_balance_failed

Changes since v4:

* Store max_allowed_capacity in task_struct and populate it when
affinity changes to avoid iterating through the capacities list in the
fast path (Vincent)
* Use rq->rd->max_cpu_capacity which is updated after hotplug
operations to check biggest allowed capacity in the system.
* Undo the change to check_misfit_status() and improve the function to
avoid similar confusion in the future.
* Split the patches differently. Exporting the capacity list and sorting
it is now patch 1; handling of affinity for misfit detection is patch 2.
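
As a rough userspace sketch of the v4 approach above (cache the biggest
allowed capacity when affinity changes, then do a cheap comparison in the fast
path): this is not the kernel code from the series. The types and function
names are hypothetical, CPU masks are modelled as a plain 64-bit bitmask, and
the fit check is a bare comparison that ignores the margin and uclamp handling
the real code uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one capacity level and the CPUs that have it. */
struct model_cap_entry {
	unsigned long capacity;
	uint64_t cpus;		/* bit N set means CPU N has this capacity */
};

/* Hypothetical model of the per-task state relevant here. */
struct model_task {
	uint64_t cpus_allowed;			/* affinity mask */
	unsigned long max_allowed_capacity;	/* cached on affinity change */
};

/*
 * Slow path, run when affinity changes. The list is assumed to be sorted
 * in descending capacity order, so the first level that intersects the
 * affinity mask is the biggest capacity the task may run on.
 */
static void model_set_max_allowed_capacity(struct model_task *p,
					    const struct model_cap_entry *list,
					    size_t nr_entries)
{
	p->max_allowed_capacity = 0;

	for (size_t i = 0; i < nr_entries; i++) {
		if (list[i].cpus & p->cpus_allowed) {
			p->max_allowed_capacity = list[i].capacity;
			return;
		}
	}
}

/*
 * Fast path. A task is only treated as misfit if it does not fit the
 * current CPU and its affinity allows a CPU with a bigger capacity.
 */
static bool model_task_is_misfit(const struct model_task *p,
				 unsigned long util,
				 unsigned long cpu_capacity)
{
	if (util <= cpu_capacity)
		return false;	/* simplified fit check: the task fits */

	if (cpu_capacity >= p->max_allowed_capacity)
		return false;	/* nowhere bigger to go, don't flag misfit */

	return true;
}
```

For example, with three capacity levels (1024, 768, 256), a task pinned to the
256-capacity CPUs is never reported as misfit in this model, however large its
utilization, because no allowed CPU could serve it better.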

Changes since v3:

* Update commit message of patch 2 to be less verbose

Changes since v2:

* Convert access of asym_cap_list to be rcu protected
* Add new patch to sort the list in descending order
* Move some declarations inside affinity check block
* Remove now redundant check against max_cpu_capacity in check_misfit_status()

Changes since v1:

* Use asym_cap_list (thanks Dietmar) to iterate instead of iterating
through every cpu which Vincent was concerned about.
* Use uclamped util to compare with capacity instead of util_fits_cpu()
when iterating through capacities (Dietmar).
* Update commit log with test results to better demonstrate the problem

v1 discussion: https://lore.kernel.org/lkml/20230820203429.568884-1-qyousef@xxxxxxxxxxx/
v2 discussion: https://lore.kernel.org/lkml/20231212154056.626978-1-qyousef@xxxxxxxxxxx/
v3 discussion: https://lore.kernel.org/lkml/20231231175218.510721-1-qyousef@xxxxxxxxxxx/
v4 discussion: https://lore.kernel.org/lkml/20240105222014.1025040-1-qyousef@xxxxxxxxxxx/
v5 discussion: https://lore.kernel.org/lkml/20240205021123.2225933-1-qyousef@xxxxxxxxxxx/
v6, v7 discussion: https://lore.kernel.org/lkml/20240220225622.2626569-1-qyousef@xxxxxxxxxxx/

Thanks!

--
Qais Yousef

Qais Yousef (4):
  sched/topology: Export asym_capacity_list
  sched/fair: Check a task has a fitting cpu when updating misfit
  sched/topology: Remove max_cpu_capacity from root_domain
  sched/fair: Don't double balance_interval for migrate_misfit

 include/linux/sched.h   |  1 +
 init/init_task.c        |  1 +
 kernel/sched/fair.c     | 79 +++++++++++++++++++++++++++++++----------
 kernel/sched/sched.h    | 16 +++++++--
 kernel/sched/topology.c | 56 ++++++++++++++---------------
 5 files changed, 104 insertions(+), 49 deletions(-)

--
2.34.1