[PATCHSET v2 sched_ext/for-7.2] sched_ext: Auto-manage ext/fair dl_server bandwidth
From: Andrea Righi
Date: Tue May 26 2026 - 04:30:20 EST
Currently, a fixed bandwidth is reserved at boot for both the fair and ext
deadline servers, and this reservation remains unchanged unless explicitly
modified via debugfs. As a result, both servers permanently contribute to global
bandwidth accounting, regardless of whether a BPF scheduler is active.
While unused bandwidth can still be reclaimed at runtime by other classes, this
static reservation prevents RT from fully utilizing the available headroom when
either the sched_ext or the fair class is guaranteed to be inactive (for
example, when no BPF scheduler is loaded, or when sched_ext runs in full mode
and replaces fair).
As discussed at the VIII OSPM summit in Cambridge [1], a better solution would
be to dynamically register and unregister deadline server bandwidth based on the
active sched_ext state. This allows the kernel to automatically enable bandwidth
accounting only for the scheduling class that is currently active, while
disabling it for inactive ones.
This patch series implements this automatic register/unregister logic. The
sched_ext total_bw kselftest is also modified to validate the correct behavior
across the different scheduling configurations and ensure that bandwidth
accounting follows the expected state transitions.
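
As a rough user-space illustration of the intended accounting (not code from
this series), the sketch below models which server contributes to total_bw in
each sched_ext state. All toy_* names are hypothetical. The mapping for partial
mode (both servers attached) is an assumption inferred from the description
above, and BW_SHIFT only mirrors the kernel's fixed-point scale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed-point utilization: 1.0 == 1 << TOY_BW_SHIFT (mirrors BW_SHIFT). */
#define TOY_BW_SHIFT 20

/* Hypothetical stand-ins for the sched_ext operating modes. */
enum toy_scx_state { TOY_SCX_DISABLED, TOY_SCX_PARTIAL, TOY_SCX_FULL };

struct toy_server {
	uint64_t bw;       /* reserved utilization, fixed-point */
	bool attached;     /* currently counted in total_bw? */
};

struct toy_root_domain {
	uint64_t total_bw; /* sum of all attached reservations */
	struct toy_server fair;
	struct toy_server ext;
};

/* runtime/period as a fixed-point ratio; 0 for a zero period. */
uint64_t toy_to_ratio(uint64_t period_ns, uint64_t runtime_ns)
{
	if (!period_ns)
		return 0;
	return (runtime_ns << TOY_BW_SHIFT) / period_ns;
}

/* Add or remove one server's reservation; idempotent. */
void toy_set_attached(struct toy_root_domain *rd, struct toy_server *s,
		      bool attach)
{
	if (s->attached == attach)
		return;
	if (attach) {
		rd->total_bw += s->bw;
	} else {
		/* A detach must never underflow the accounting. */
		assert(rd->total_bw >= s->bw);
		rd->total_bw -= s->bw;
	}
	s->attached = attach;
}

/*
 * Account only the servers whose class can actually run tasks:
 *   disabled -> fair only, partial -> fair + ext (assumed), full -> ext only.
 */
void toy_apply_state(struct toy_root_domain *rd, enum toy_scx_state state)
{
	toy_set_attached(rd, &rd->ext, state != TOY_SCX_DISABLED);
	toy_set_attached(rd, &rd->fair, state != TOY_SCX_FULL);
}
```

In this model, with both servers given the same illustrative reservation,
total_bw equals a single reservation in the disabled and full-mode states and
twice that in partial mode, so the inactive class no longer consumes headroom.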
[1] https://retis.santannapisa.it/ospm-summit/
Git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arighi/linux.git dl-server-bw-v2
Changes in v2:
- Rework the sched_ext enable path as suggested by Peter: attach ext_server
before committing the scheduler switch and fail the enable if admission
control rejects the reservation; detach fair_server only after a successful
full-mode switch.
- Added dl_server_swap_bw() for the disable/recovery path so ext_server detach
and fair_server reattach happen under the same dl_b->lock, closing the
window where concurrent SCHED_DEADLINE admission could steal the freed
bandwidth (reported by Sashiko).
- Fixed the attach/detach accounting issue reported by Sashiko by updating
rq->dl.this_bw together with root-domain total_bw, draining active or
non-contending servers before detach and preventing detached servers from
starting.
- Reuse dl_rq_change_utilization() to drain the server, so the detach path goes
through the same machinery as dl_server_apply_params().
- Made root-domain accounting honor the same cpu_active() conditions used by
root-domain rebuilds, while preserving runtime/period updates made while a
server is detached.
- Fixed the total_bw selftest issues reported by Sashiko: check fclose()
errors for debugfs writes, preserve per-CPU fair_server runtime values, and
restore all CPUs on cleanup even if one write fails.
- Link to v1: https://lore.kernel.org/all/20260521174509.1534623-1-arighi@xxxxxxxxxx/
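
The ordering in the first two items above can be sketched in user space as
follows. This is a simplified single-threaded model under stated assumptions,
not the kernel implementation: every toy_* name is hypothetical, the "lock" is
a flag that only asserts the locking discipline (standing in for dl_b->lock),
and admission control is reduced to a single capacity check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_dl_bw {
	bool locked;       /* stand-in for dl_b->lock */
	uint64_t cap;      /* maximum admissible bandwidth */
	uint64_t total_bw; /* bandwidth currently reserved */
};

struct toy_server {
	uint64_t bw;
	bool attached;
};

void toy_lock(struct toy_dl_bw *b)   { assert(!b->locked); b->locked = true; }
void toy_unlock(struct toy_dl_bw *b) { assert(b->locked); b->locked = false; }

/* Caller holds the lock. Returns -EBUSY if admission control rejects. */
int toy_attach_locked(struct toy_dl_bw *b, struct toy_server *s)
{
	assert(b->locked);
	if (s->attached)
		return 0;
	if (b->total_bw + s->bw > b->cap)
		return -EBUSY;
	b->total_bw += s->bw;
	s->attached = true;
	return 0;
}

/* Caller holds the lock. */
void toy_detach_locked(struct toy_dl_bw *b, struct toy_server *s)
{
	assert(b->locked);
	if (!s->attached)
		return;
	assert(b->total_bw >= s->bw);
	b->total_bw -= s->bw;
	s->attached = false;
}

/*
 * Enable path: reserve ext bandwidth first and fail the enable if it is
 * rejected; release the fair reservation only after a full-mode switch.
 */
int toy_scx_enable(struct toy_dl_bw *b, struct toy_server *fair,
		   struct toy_server *ext, bool full)
{
	int ret;

	toy_lock(b);
	ret = toy_attach_locked(b, ext);
	toy_unlock(b);
	if (ret)
		return ret; /* nothing committed, fair left untouched */

	/* ... the scheduler switch would be committed here ... */

	if (full) {
		toy_lock(b);
		toy_detach_locked(b, fair);
		toy_unlock(b);
	}
	return 0;
}

/*
 * Disable/recovery path: detach one server and attach the other in a single
 * critical section, so nothing else can claim the bandwidth in between.
 */
int toy_swap_bw(struct toy_dl_bw *b, struct toy_server *out,
		struct toy_server *in)
{
	int ret;

	toy_lock(b);
	toy_detach_locked(b, out);
	ret = toy_attach_locked(b, in);
	toy_unlock(b);
	return ret;
}
```

Performing the detach and the attach as two separately locked steps would
leave a window in which a concurrent admission could consume the freed
bandwidth and make the reattach fail; the single critical section in
toy_swap_bw() is the toy equivalent of closing that window.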
Andrea Righi (2):
  sched_ext: Auto-register/unregister dl_server reservations
  selftests/sched_ext: Validate dl_server attach/detach in total_bw test

 include/linux/sched.h                        |   6 +
 kernel/sched/deadline.c                      | 207 +++++++++++++++++++++++++--
 kernel/sched/ext.c                           |  71 +++++++++
 kernel/sched/sched.h                         |   4 +
 tools/testing/selftests/sched_ext/total_bw.c | 201 +++++++++++++++++++++++++-
 5 files changed, 480 insertions(+), 9 deletions(-)