[PATCHv2 1/2] sched/deadline: Skip the deadline bandwidth check if kexec_in_progress

From: Pingfan Liu

Date: Mon Oct 27 2025 - 23:09:54 EST


During discussion of the scheduler deadline bug [1], Pierre Gondois
pointed out a potential issue during kexec: as CPUs are unplugged, the
available DL bandwidth of the root domain gradually decreases. Once the
remaining bandwidth can no longer cover the admitted DL tasks, the
overflow check fails, CPU hot-removal is refused and kexec hangs [2].

This can be reproduced by:

  chrt -d -T 1000000 -P 1000000 0 yes > /dev/null &
  kexec -e

Once kexec is in progress the running kernel is about to be replaced, so
there is no need to honor DL bandwidth guarantees. Skip the DL bandwidth
check in that case so that CPU hot-removal, and with it kexec, can proceed.

[1]: https://lore.kernel.org/all/20250929133602.32462-1-piliu@xxxxxxxxxx/
[2]: https://lore.kernel.org/all/3408aca5-e6c9-434a-9950-82e9147fcbba@xxxxxxx/

Reported-by: Pierre Gondois <pierre.gondois@xxxxxxx>
Closes: https://lore.kernel.org/all/3408aca5-e6c9-434a-9950-82e9147fcbba@xxxxxxx/
Signed-off-by: Pingfan Liu <piliu@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Pierre Gondois <pierre.gondois@xxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Baoquan He <bhe@xxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@xxxxxxxxx>
Cc: Joel Granados <joel.granados@xxxxxxxxxx>
To: kexec@xxxxxxxxxxxxxxxxxxx
To: linux-kernel@xxxxxxxxxxxxxxx
---
kernel/kexec_core.c | 6 ++++++
kernel/sched/deadline.c | 7 +++++++
2 files changed, 13 insertions(+)
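
For readers who want to see why the reproducer in the commit message
trips the check, here is a simplified user-space model of the admission
arithmetic. It is only a sketch under stated assumptions, not the kernel
code: it assumes every CPU has full capacity, assumes the default 95%
per-CPU limit (950000 / 1000000), and ignores details such as the fair
server bandwidth. to_ratio() and BW_SHIFT mirror names used in the
scheduler; would_overflow() and first_failing_removal() are hypothetical
helpers introduced for illustration.

```c
#include <stdint.h>

#define BW_SHIFT 20	/* fixed-point shift: 1 << 20 represents 100% of one CPU */

/* Express runtime/period as a fixed-point utilization. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	return (runtime << BW_SHIFT) / period;
}

/*
 * Hypothetical helper: return 1 if the bandwidth offered by cpus_left
 * CPUs (each limited to per_cpu_limit) cannot cover total_bw.
 */
static int would_overflow(uint64_t per_cpu_limit, unsigned int cpus_left,
			  uint64_t total_bw)
{
	return per_cpu_limit * cpus_left < total_bw;
}

/*
 * Hypothetical helper: unplug CPUs one at a time and return the number
 * of CPUs that the first refused removal would have left online, or -1
 * if every removal down to a single CPU is accepted.
 */
static int first_failing_removal(unsigned int ncpus, uint64_t per_cpu_limit,
				 uint64_t total_bw)
{
	for (unsigned int left = ncpus; left > 1; left--) {
		if (would_overflow(per_cpu_limit, left - 1, total_bw))
			return (int)(left - 1);
	}
	return -1;
}
```

Under these assumptions, a task whose runtime equals its period
(as in the chrt command above) has a utilization of to_ratio(1000000,
1000000), which is larger than the 95% limit of a single CPU. Calling
first_failing_removal(4, to_ratio(1000000, 950000), to_ratio(1000000,
1000000)) therefore reports that the removal which would leave one CPU
online is the one that gets refused, which is the step kexec needs.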

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 31203f0bacafa..265de9d1ff5f5 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1183,7 +1183,13 @@ int kernel_kexec(void)
} else
#endif
{
+ /*
+ * The CPU hot-removal path reads kexec_in_progress. Set it with
+ * hotplug disabled so that no hot-removal is in flight.
+ */
+ cpu_hotplug_disable();
kexec_in_progress = true;
+ cpu_hotplug_enable();
kernel_restart_prepare("kexec reboot");
migrate_to_reboot_cpu();
syscore_shutdown();
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 72c1f72463c75..9db6f26b6cc81 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -18,6 +18,7 @@

#include <linux/cpuset.h>
#include <linux/sched/clock.h>
+#include <linux/kexec.h>
#include <uapi/linux/sched/types.h>
#include "sched.h"
#include "pelt.h"
@@ -3484,6 +3485,12 @@ static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw)

int dl_bw_deactivate(int cpu)
{
+ /*
+ * kexec is in progress and cannot be rolled back. Skip the bandwidth
+ * check, since a failure here would make CPU hot-removal fail.
+ */
+ if (unlikely(kexec_in_progress))
+ return 0;
return dl_bw_manage(dl_bw_req_deactivate, cpu, 0);
}

--
2.49.0