Re: [PATCH] [RFC]: sched/deadline: Avoid double enqueue_pushable_dl_task() warning

From: Peter Zijlstra

Date: Wed Mar 04 2026 - 04:52:08 EST


On Wed, Mar 04, 2026 at 08:06:53AM +0100, Juri Lelli wrote:
> Hello,
>
> On 03/03/26 19:41, John Stultz wrote:
> > In testing with the full Proxy Execution patch stack, I found
> > I would occasionally trip over the !RB_EMPTY_NODE() WARN_ON in
> > enqueue_pushable_dl_task(), where the task we're adding to the
> > pushable list is already enqueued.
> >
> > This triggers from put_prev_task_dl(), where it seems we go into
> > put_prev_task_dl()
> > -> update_curr_dl()
> > -> update_curr_dl_se() [hitting the dl_runtime_exceeded() case]
> > -> enqueue_task_dl()
> > -> enqueue_pushable_dl_task()
> >
> > Adding the task to the pushable list the first time.
>
> Ah, so in case the task is boosted (or we fail to start the
> replenishment timer).
>
> > Then we back up the call stack to put_prev_task_dl(), which at
> > the end again calls enqueue_pushable_dl_task(), trying to add it
> > a second time, tripping the warning.
> >
> > To avoid this, add a dl_task_pushable() helper which we can use
> > to replace the RB_EMPTY_NODE checks elsewhere, and then before
> > enqueueing in put_prev_task_dl(), we can first check
> > dl_task_pushable() to avoid the double enqueue.
>
> Can't we just return early (as we already do in
> dequeue_pushable_dl_task()) in enqueue_pushable_dl_task() instead of
> checking before calling that function?

So I was mightily confused for a moment by all this.

But it turns out DL keeps current *in* the tree; since ->deadline is a
lot less mutable than ->vruntime this is possible.

But that also means that set_next_task() / put_prev_task() are
'simpler'.

However, in this case I think they are too simple, and it's leading to
problems.

So update_curr_dl() needs to dequeue+enqueue because it is pushing
->deadline forward (current is in the tree). Because of that, it also
ends up doing enqueue_pushable_dl_task(), but that is actively wrong:
you must not do that for current.

Only once you're in put_prev_task() should you do that.

If this happens from an update_curr() that is not called via
put_prev_task(), you end up with current on the pushable list, which is
a big no-no.

So I think I'd like to see dl_rq->curr tracking, similar to what we have
for fair.

If you do that (see below), you'll probably find this case was already
handled 'correctly' but was broken by the PE thing ;-)

Note the !task_current() clause in enqueue_task_dl() guarding
enqueue_pushable_dl_task().

Hmm?

---
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 9869025941a0..a7db81d17082 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2295,7 +2295,10 @@ static void dequeue_dl_entity(struct sched_dl_entity *dl_se, int flags)

static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
{
- if (is_dl_boosted(&p->dl)) {
+ struct sched_dl_entity *dl_se = &p->dl;
+ struct dl_rq *dl_rq = &rq->dl;
+
+ if (is_dl_boosted(dl_se)) {
/*
* Because of delays in the detection of the overrun of a
* thread's runtime, it might be the case that a thread
@@ -2308,14 +2311,14 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
*
* In this case, the boost overrides the throttle.
*/
- if (p->dl.dl_throttled) {
+ if (dl_se->dl_throttled) {
/*
* The replenish timer needs to be canceled. No
* problem if it fires concurrently: boosted threads
* are ignored in dl_task_timer().
*/
- cancel_replenish_timer(&p->dl);
- p->dl.dl_throttled = 0;
+ cancel_replenish_timer(dl_se);
+ dl_se->dl_throttled = 0;
}
} else if (!dl_prio(p->normal_prio)) {
/*
@@ -2327,7 +2330,7 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
* being boosted again with no means to replenish the runtime and clear
* the throttle.
*/
- p->dl.dl_throttled = 0;
+ dl_se->dl_throttled = 0;
if (!(flags & ENQUEUE_REPLENISH))
printk_deferred_once("sched: DL de-boosted task PID %d: REPLENISH flag missing\n",
task_pid_nr(p));
@@ -2336,20 +2339,23 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
}

check_schedstat_required();
- update_stats_wait_start_dl(dl_rq_of_se(&p->dl), &p->dl);
+ update_stats_wait_start_dl(dl_rq, dl_se);

if (p->on_rq == TASK_ON_RQ_MIGRATING)
flags |= ENQUEUE_MIGRATING;

- enqueue_dl_entity(&p->dl, flags);
+ enqueue_dl_entity(dl_se, flags);

- if (dl_server(&p->dl))
+ if (dl_server(dl_se))
return;

if (task_is_blocked(p))
return;

- if (!task_current(rq, p) && !p->dl.dl_throttled && p->nr_cpus_allowed > 1)
+ if (dl_rq->curr == dl_se)
+ return;
+
+ if (!p->dl.dl_throttled && p->nr_cpus_allowed > 1)
enqueue_pushable_dl_task(rq, p);
}

@@ -2565,6 +2571,10 @@ static void start_hrtick_dl(struct rq *rq, struct sched_dl_entity *dl_se)
}
#endif /* !CONFIG_SCHED_HRTICK */

+/*
+ * DL keeps current in tree, because ->deadline is not typically changed while
+ * a task is runnable.
+ */
static void set_next_task_dl(struct rq *rq, struct task_struct *p, bool first)
{
struct sched_dl_entity *dl_se = &p->dl;
@@ -2585,6 +2595,9 @@ static void set_next_task_dl(struct rq *rq, struct task_struct *p, bool first)

deadline_queue_push_tasks(rq);

+ WARN_ON_ONCE(dl_rq->curr);
+ dl_rq->curr = dl_se;
+
if (hrtick_enabled_dl(rq))
start_hrtick_dl(rq, &p->dl);
}
@@ -2640,17 +2653,20 @@ static void put_prev_task_dl(struct rq *rq, struct task_struct *p, struct task_s
struct sched_dl_entity *dl_se = &p->dl;
struct dl_rq *dl_rq = &rq->dl;

- if (on_dl_rq(&p->dl))
+ if (on_dl_rq(dl_se))
update_stats_wait_start_dl(dl_rq, dl_se);

update_curr_dl(rq);

update_dl_rq_load_avg(rq_clock_pelt(rq), rq, 1);

+ WARN_ON_ONCE(dl_rq->curr != dl_se);
+ dl_rq->curr = NULL;
+
if (task_is_blocked(p))
return;

- if (on_dl_rq(&p->dl) && p->nr_cpus_allowed > 1)
+ if (on_dl_rq(dl_se) && p->nr_cpus_allowed > 1)
enqueue_pushable_dl_task(rq, p);
}

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index fd36ae390520..fb7e6c1f31e2 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -880,6 +880,7 @@ struct dl_rq {

bool overloaded;

+ struct sched_dl_entity *curr;
/*
* Tasks on this rq that can be pushed away. They are kept in
* an rb-tree, ordered by tasks' deadlines, with caching