[RFC PATCH 0/3] sched/deadline: cpuset: Rework DEADLINE bandwidth restoration

From: Juri Lelli
Date: Wed Mar 15 2023 - 08:20:41 EST


Qais reported [1] that, when root domains are rebuilt, iterating over
all tasks to find the DEADLINE ones and correctly restore their
bandwidth on the new root domains can be a costly operation (10+ ms
delays on suspend-resume). He proposed that we skip rebuilding root
domains for certain operations, but that approach seemed arch-specific
and possibly error-prone, as the paths that ultimately trigger a rebuild
can be quite convoluted (thanks Qais for spending time on this!).

To fix the problem, I would instead propose that we:

 1 - Bring back cpuset_mutex (so that we have write access to cpusets
     from scheduler operations, and we also fix some problems
     associated with percpu_cpuset_rwsem)
 2 - Keep track of the number of DEADLINE tasks belonging to each cpuset
 3 - Use this information to only perform the costly iteration if
     DEADLINE tasks are actually present in the cpuset for which a
     corresponding root domain is being rebuilt
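
As a rough userspace illustration of points 2 and 3 (this is not the
code from the series; the struct, field and function names below are
hypothetical, and the locking that cpuset_mutex would provide is left
out entirely), the idea is a per-cpuset counter that gates the walk
over tasks:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct task_sketch {
	int is_deadline;		/* non-zero for a DEADLINE task */
};

struct cpuset_sketch {
	int nr_deadline_tasks;		/* point 2: per-cpuset DEADLINE count */
	struct task_sketch *tasks;	/* tasks attached to this cpuset */
	size_t nr_tasks;
};

/* Called when a task in this cpuset becomes DEADLINE or is attached. */
static void sketch_inc_dl_tasks(struct cpuset_sketch *cs)
{
	cs->nr_deadline_tasks++;
}

/* Called when a DEADLINE task leaves the class or the cpuset. */
static void sketch_dec_dl_tasks(struct cpuset_sketch *cs)
{
	assert(cs->nr_deadline_tasks > 0);
	cs->nr_deadline_tasks--;
}

/*
 * Point 3: only walk the tasks if the counter says there is something
 * to restore. Returns how many tasks were visited and stores in
 * *restored how many DEADLINE tasks would have their bandwidth
 * re-added to the rebuilt root domain.
 */
static size_t sketch_restore_dl_bw(const struct cpuset_sketch *cs,
				   int *restored)
{
	size_t visited = 0;

	*restored = 0;
	if (cs->nr_deadline_tasks == 0)
		return 0;		/* fast path: skip the iteration */

	for (size_t i = 0; i < cs->nr_tasks; i++) {
		visited++;
		if (cs->tasks[i].is_deadline)
			(*restored)++;
	}
	return visited;
}
```

In this sketch the counter has to be updated from wherever a task
changes scheduling class or cpuset, which is presumably why write
access to cpusets from scheduler operations (point 1) is needed first.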

This set is also available from

https://github.com/jlelli/linux.git deadline/rework-cpusets

Feedback is more than welcome.

Best,
Juri

[1] https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@xxxxxxxxxxx/

Juri Lelli (3):
  sched/cpuset: Bring back cpuset_mutex
  sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
  cgroup/cpuset: Iterate only if DEADLINE tasks are present

 include/linux/cpuset.h |  12 ++-
 kernel/cgroup/cgroup.c |   4 +
 kernel/cgroup/cpuset.c | 175 +++++++++++++++++++++++------------------
 kernel/sched/core.c    |  32 ++++++--
 4 files changed, 137 insertions(+), 86 deletions(-)

--
2.39.2