Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
From: zhidao su
Date: Sun Apr 05 2026 - 04:37:34 EST
On Sat, Apr 04, 2026 at 12:22:44PM +0200, Peter Zijlstra wrote:
> Random brain wave...
>
> Since the dl_server is LLF (deferred), it will pretty much always trip
> the dl_entity_overflow() when interrupted, right? Does it make sense to
> use the revised wake-up rule for it, when appropriate?
Thanks for the brain wave!
Tested your diff: locktorture boot time drops to ~13s (vs ~37-52s with
the hack revert) and ksched_football ball_pos stays at 0.
I traced update_dl_entity() and found that every else-branch hit shows
dl_defer_running=1 with dl_throttled=0 and dl_defer_armed=0. That is
the [D:running] state, so the guard there is correct. The actual stale
case is in the if-branch (overflow=1, deadline not past,
dl_defer_running=1), which your diff handles via the revised wakeup.
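
To make the case split above concrete, here is a rough userspace model
of how I am classifying the traced hits. This is not kernel code: the
struct, the enum, and classify() are made up for illustration, and the
overflow flag simply stands in for whatever dl_entity_overflow() would
report at that point.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the fields looked at in the trace. */
struct dl_model {
	bool deadline_past;    /* deadline is before the current rq clock */
	bool overflow;         /* stand-in for the dl_entity_overflow() result */
	bool dl_defer_running;
	bool dl_throttled;
	bool dl_defer_armed;
};

/* Hypothetical labels for the paths discussed in this thread. */
enum dl_path {
	DL_PATH_IF_DEADLINE_PAST, /* if-branch: deadline already past */
	DL_PATH_IF_STALE,         /* if-branch: overflow only, dl_defer_running still set */
	DL_PATH_IF_OVERFLOW,      /* if-branch: overflow only, dl_defer_running clear */
	DL_PATH_ELSE_RUNNING,     /* else-branch: the [D:running] state */
	DL_PATH_ELSE_OTHER,       /* else-branch: anything else */
};

static enum dl_path classify(const struct dl_model *s)
{
	/* The if-branch is taken when the deadline is past or on overflow. */
	if (s->deadline_past)
		return DL_PATH_IF_DEADLINE_PAST;
	if (s->overflow)
		return s->dl_defer_running ? DL_PATH_IF_STALE
					   : DL_PATH_IF_OVERFLOW;

	/* Else-branch: [D:running] is running=1, throttled=0, armed=0. */
	if (s->dl_defer_running && !s->dl_throttled && !s->dl_defer_armed)
		return DL_PATH_ELSE_RUNNING;
	return DL_PATH_ELSE_OTHER;
}
```

In this model the stale case is DL_PATH_IF_STALE, and the else-branch
hits I saw all map to DL_PATH_ELSE_RUNNING.
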
That also means our original else-branch fix was wrong: unconditionally
clearing dl_defer_running in [D:running] would corrupt a legitimately
running server's state.
Is your revised wakeup diff the intended replacement for 115135422562?
If so, happy to test further or help draft it into a proper patch.
Signed-off-by: zhidao su <suzhidao@xxxxxxxxxx>