[PATCH v2] sched/rt: Push RT tasks when preempted by a deadline task

From: yangsonghua

Date: Thu Jun 25 2026 - 23:26:29 EST


Commit 704069649b5b ("sched/core: Rework sched_class::wakeup_preempt()
and rq_modified_*()") made wakeup_preempt_rt() callable for cross-class
wakeups, leaving a "XXX If we're preempted by DL, queue a push?" comment
with no implementation.

When a SCHED_DEADLINE task preempts a running SCHED_FIFO/SCHED_RR task,
the RT class's put_prev_task_rt() is the natural place to trigger a push:
at that point the preempted task is (conditionally) added to
pushable_tasks, and we have 'next' available to identify the reason for
preemption.

The earlier wakeup_preempt_rt() call site is too early: the running RT
task has not yet been added to pushable_tasks, so rt_queue_push_tasks()
would be a no-op for the common single-RT-task-per-CPU case.

Guard the dl_task(next) check with a NULL test. put_prev_task_rt() is
reachable with next == NULL when called via the bare put_prev_task()
wrapper (sched.h), which always passes NULL:

  sched_change_begin()
    put_prev_task(rq, p)              /* next = NULL */
      put_prev_task_rt(rq, p, NULL)
        dl_task(NULL)                 /* NULL deref, the bug in v1 */

The actual DL-preemption path goes through put_prev_set_next_task(),
which supplies the real next task, so the NULL guard does not change
the intended trigger condition.
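
For illustration only, the guard can be modelled in userspace C. This is a minimal sketch, not kernel code: struct task_model and the model_*() helpers are hypothetical names, and the assumption that a deadline task is identified by a negative prio mirrors my understanding of dl_prio() rather than anything stated in this patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; only the field used here. */
struct task_model {
	int prio;	/* assumption: a negative value models a deadline task */
};

/* Models dl_task(): it dereferences its argument, so NULL must not reach it. */
static bool model_dl_task(const struct task_model *p)
{
	return p->prio < 0;
}

/*
 * Models the guarded trigger from this patch: the short-circuit '&&'
 * ensures model_dl_task() is never evaluated when next is NULL.
 */
static bool model_should_queue_push(const struct task_model *next)
{
	return next && model_dl_task(next);
}
```

With next == NULL (the put_prev_task() wrapper case) the model returns false without touching the pointer; with a real next task the result depends only on whether that task is a deadline task.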

Regarding potentially unnecessary push/pull cycles for short-lived DL
tasks: this is an inherent property of any asynchronous push mechanism.
rt_queue_push_tasks() enqueues a balance callback rather than migrating
synchronously, so by the time it fires the DL task may have already
yielded; in that case the RT task may need to be pulled back. This
trade-off is acceptable: leaving RT tasks stranded on a busy CPU
while peer CPUs sit idle is worse than an occasional redundant
migration. The same concern applies to the existing triggers of
rt_queue_push_tasks() (e.g. after dequeue_task_rt), so this path is
consistent with the established policy.

Implement the push by adding a dl_task(next) check in put_prev_task_rt(),
guarded against NULL. The check is intentionally placed outside and after
the enqueue_pushable_task() condition, so that a push is triggered even
when the preempted RT task is pinned (nr_cpus_allowed == 1) or blocked
(proxy-exec): in those cases there may be other migratable RT tasks
already sitting in pushable_tasks.

The task_is_blocked() early-return is folded into the
enqueue_pushable_task() condition to avoid inadvertently suppressing the
new push logic.
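
The independence of the two decisions can be sketched as a small userspace model. This is an assumption-laden illustration, not the kernel implementation: struct put_prev_inputs, struct put_prev_actions and model_put_prev_task_rt() are hypothetical names that flatten the conditions the patched function evaluates into plain booleans.

```c
#include <stdbool.h>

/* Hypothetical flattened view of what put_prev_task_rt() inspects. */
struct put_prev_inputs {
	bool prev_blocked;		/* models task_is_blocked(p) */
	bool prev_on_rt_rq;		/* models on_rt_rq(&p->rt) */
	int  prev_nr_cpus_allowed;	/* models p->nr_cpus_allowed */
	bool has_next;			/* models next != NULL */
	bool next_is_dl;		/* models dl_task(next); ignored without next */
};

/* The two side effects the patched function may take. */
struct put_prev_actions {
	bool enqueue_pushable;		/* would call enqueue_pushable_task() */
	bool queue_push;		/* would call rt_queue_push_tasks() */
};

static struct put_prev_actions model_put_prev_task_rt(struct put_prev_inputs in)
{
	struct put_prev_actions out;

	/* Blocked check folded into the condition instead of returning early. */
	out.enqueue_pushable = !in.prev_blocked && in.prev_on_rt_rq &&
			       in.prev_nr_cpus_allowed > 1;

	/* Evaluated regardless of whether the previous task was enqueued. */
	out.queue_push = in.has_next && in.next_is_dl;

	return out;
}
```

In this model a pinned or blocked previous task yields enqueue_pushable == false while queue_push can still be true, which is the case the placement is meant to preserve. With the old early return, a blocked task would have skipped the push decision entirely.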

Note: this does not cover the DL-server case where a fair-class task
running under DL bandwidth budget displaces an RT task; that is a
separate concern.

Signed-off-by: yangsonghua <yangsonghua@xxxxxxxxxxx>
---
Changes in v2:
- Guard dl_task(next) with a 'next != NULL' check to fix a NULL pointer
  dereference when put_prev_task_rt() is called from sched_change_begin()
  via the put_prev_task() wrapper, which always passes next = NULL.
- Extend the commit message to explain the NULL call path and address the
  reviewer concern about unnecessary push/pull cycles for short-lived DL
  tasks.
- Improve wakeup_preempt_rt() comment to explicitly point readers to
  put_prev_task_rt() where the DL push is now handled.

kernel/sched/rt.c | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index e474c31d8fe6..ef8cdc4c25d2 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1627,7 +1627,10 @@ static void wakeup_preempt_rt(struct rq *rq, struct task_struct *p, int flags)
 	struct task_struct *donor = rq->donor;

 	/*
-	 * XXX If we're preempted by DL, queue a push?
+	 * If the incoming task belongs to a higher-priority class (e.g. DL),
+	 * we only handle same-class RT preemption here. The DL case is
+	 * handled in put_prev_task_rt() once the current task has been
+	 * added to pushable_tasks.
 	 */
 	if (p->sched_class != &rt_sched_class)
 		return;
@@ -1736,14 +1739,25 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p, struct task_s

 	update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 1);

-	if (task_is_blocked(p))
-		return;
 	/*
 	 * The previous task needs to be made eligible for pushing
-	 * if it is still active
+	 * if it is still active and migratable.
 	 */
-	if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1)
+	if (!task_is_blocked(p) && on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1)
 		enqueue_pushable_task(rq, p);
+
+	/*
+	 * When a deadline task takes over this CPU, try to push any queued
+	 * RT tasks to CPUs running lower-priority work. This is independent
+	 * of whether p itself is pushable: even if p is pinned or blocked,
+	 * there may be other migratable RT tasks already in pushable_tasks.
+	 *
+	 * next is NULL when called from put_prev_task() (e.g. sched_change_begin);
+	 * guard accordingly. rt_queue_push_tasks() checks has_pushable_tasks()
+	 * internally, so this is a no-op if nothing is queued.
+	 */
+	if (next && dl_task(next))
+		rt_queue_push_tasks(rq);
 }

 /* Only try algorithms three times */
--
2.34.1