Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch

From: zhidao su

Date: Sun Apr 05 2026 - 04:37:34 EST


On Sat, Apr 04, 2026 at 12:22:44PM +0200, Peter Zijlstra wrote:
> Random brain wave...
>
> Since the dl_server is LLF (deferred), it will pretty much always trip
> the dl_entity_overflow() when interrupted, right? Does it make sense to
> use the revised wake-up rule for it, when appropriate?

Thanks for the brain wave!

Tested your diff — locktorture boot time drops to ~13s (vs ~37-52s with
the hack revert) and ksched_football ball_pos stays at 0.

I traced update_dl_entity() and found that every else-branch hit shows
dl_defer_running=1 with dl_throttled=0 and dl_defer_armed=0. That is
the [D:running] state, so the guard there is correct. The actual stale
case is in the if-branch (overflow=1, deadline not past, dl_defer_running=1),
which your diff handles via the revised wakeup.
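
To make the two cases concrete, here is a rough user-space sketch of how
I bucketed the trace hits. It is not kernel code: struct dl_model,
enum dl_case and classify() are hypothetical names, the overflow result
is taken as an input rather than computed, and the outer condition
(deadline passed or overflow) is my reading of update_dl_entity().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; the flag names mirror the fields quoted above. */
struct dl_model {
	uint64_t deadline;	/* absolute deadline */
	bool overflow;		/* assumed result of dl_entity_overflow() */
	bool dl_defer_running;
	bool dl_throttled;
	bool dl_defer_armed;
};

enum dl_case {
	DL_IF_STALE,		/* if-branch: overflow, deadline not past, defer_running=1 */
	DL_IF_OTHER,		/* if-branch: any other hit */
	DL_ELSE_RUNNING,	/* else-branch: [D:running] */
	DL_ELSE_OTHER,		/* else-branch: any other hit */
};

/* Wrap-safe "deadline is before now" comparison. */
static bool deadline_passed(uint64_t deadline, uint64_t now)
{
	return (int64_t)(deadline - now) < 0;
}

static enum dl_case classify(const struct dl_model *se, uint64_t now)
{
	bool past = deadline_passed(se->deadline, now);

	/* Assumed if-branch condition: deadline passed or overflow tripped. */
	if (past || se->overflow) {
		if (se->overflow && !past && se->dl_defer_running)
			return DL_IF_STALE;
		return DL_IF_OTHER;
	}

	if (se->dl_defer_running && !se->dl_throttled && !se->dl_defer_armed)
		return DL_ELSE_RUNNING;
	return DL_ELSE_OTHER;
}
```

With this bucketing, all the else-branch hits I saw land in
DL_ELSE_RUNNING, and the stale ones land in DL_IF_STALE.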

That also means our original else-branch fix was wrong — unconditionally
clearing dl_defer_running in [D:running] would corrupt a legitimately
running server's state.

Is your revised wakeup diff the intended replacement for 115135422562?
If so, happy to test further or help draft it into a proper patch.

Signed-off-by: zhidao su <suzhidao@xxxxxxxxxx>