Re: sched/deadline: Use revised wakeup rule for dl_server

From: Andreas Ziegler

Date: Tue May 26 2026 - 00:11:18 EST

On 2026-05-25 07:25, Christian Loehle wrote:

On 5/11/26 10:47, Christian Loehle wrote:

On 5/9/26 12:42, Andreas Ziegler wrote:

Hi Christian, Everyone,

On 2026-05-08 14:13, Christian Loehle wrote:

On 5/8/26 13:06, Andreas Ziegler wrote:

Hi Christian,

On 2026-05-08 09:20, Christian Loehle wrote:

On 5/8/26 09:09, Andreas Ziegler wrote:

Linux kernel version: 6.12
CONFIG_PREEMPT_RT (w/ PREEMPT_RT patch applied)
Architecture: aarch64
Platform: Raspberry Pi 4

Hi everyone,

Commit d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) [1] introduced a marked degradation in scheduling latency for real-time tasks in the presence of heavy I/O load.

--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1079,7 +1079,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
     if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
         dl_entity_overflow(dl_se, rq_clock(rq))) {

-        if (unlikely(!dl_is_implicit(dl_se) &&
+        if (unlikely((!dl_is_implicit(dl_se) || dl_se->dl_defer) &&
                  !dl_time_before(dl_se->deadline, rq_clock(rq)) &&
                  !is_dl_boosted(dl_se))) {
             update_dl_revised_wakeup(dl_se, rq);

This was observed using a modified version of Con Kolivas' interactivity benchmark [2]; kernel bisection eventually pointed to the above-mentioned commit.
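To make the effect of the one-line change concrete, here is a minimal userspace sketch of the decision taken in update_dl_entity(). It is not the kernel code: struct dl_sketch, the sketch_* helpers and the revised_for_defer toggle (false models the condition before d66792919d4f, true the condition after it) are hypothetical names, priority-inheritance boosting and the kernel's overflow-avoiding shifts are left out, and times are plain integers in arbitrary units.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct sched_dl_entity. */
struct dl_sketch {
    uint64_t dl_runtime;  /* configured budget per period */
    uint64_t dl_deadline; /* configured relative deadline */
    uint64_t dl_period;   /* configured period */
    uint64_t runtime;     /* remaining budget */
    uint64_t deadline;    /* current absolute deadline */
    bool dl_defer;        /* entity is a deferred server */
};

enum sketch_action { SKETCH_KEEP, SKETCH_REVISED, SKETCH_NEW_PERIOD };

/* Implicit deadline: the relative deadline equals the period. */
static bool sketch_is_implicit(const struct dl_sketch *se)
{
    return se->dl_deadline == se->dl_period;
}

/*
 * CBS overflow test: runtime / (deadline - now) > dl_runtime / dl_deadline,
 * cross-multiplied to avoid a division. Only valid while deadline >= now.
 * Small values are assumed so the products fit in 64 bits.
 */
static bool sketch_overflow(const struct dl_sketch *se, uint64_t now)
{
    return se->dl_deadline * se->runtime >
           (se->deadline - now) * se->dl_runtime;
}

/*
 * Wakeup decision. revised_for_defer is an assumption of this sketch:
 * it switches between the old condition (!implicit) and the new one
 * (!implicit || dl_defer). Boosting is ignored.
 */
static enum sketch_action sketch_update(struct dl_sketch *se, uint64_t now,
                                        bool revised_for_defer)
{
    bool past = se->deadline < now;

    if (!past && !sketch_overflow(se, now))
        return SKETCH_KEEP; /* current runtime and deadline stay valid */

    if ((!sketch_is_implicit(se) || (revised_for_defer && se->dl_defer)) &&
        !past) {
        /* Revised wakeup: keep the deadline, scale runtime to the laxity. */
        assert(se->deadline >= now);
        se->runtime = (se->deadline - now) * se->dl_runtime / se->dl_deadline;
        return SKETCH_REVISED;
    }

    /* Otherwise start a new period with a full budget. */
    se->deadline = now + se->dl_deadline;
    se->runtime = se->dl_runtime;
    return SKETCH_NEW_PERIOD;
}
```

For a deferred server with budget 50 per period 1000 and an implicit deadline that wakes up 100 units before its deadline with a full budget of 50, the sketch starts a new period under the old condition (runtime 50, deadline moved to now + 1000) and applies the revised rule under the new one (runtime trimmed to 5, deadline unchanged).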

Benchmark results before d66792919d4f:

--- Benchmarking simulated cpu of Audio real time in the presence of simulated ---
Load     Latency +/- SD        median      max [100n]  Desired CPU  Deadlines met [%]
None        76.6 +/- 8.3654        76      166
Video       78.5 +/- 3.9433        78      107
X           76.4 +/- 8.123         75      157
Burn        72.0 +/- 6.4733        71      127
Write      255.3 +/- 26.627       252      331
Read       226.6 +/- 12.38        227      262
Ring        84.2 +/- 6.6207        83      125
Compile    225.3 +/- 23.949       222      328

           136.8 +/- 78.462                331

Benchmark results after d66792919d4f:

--- Benchmarking simulated cpu of Audio real time in the presence of simulated ---
Load     Latency +/- SD        median      max [100n]  Desired CPU  Deadlines met [%]
None        68.4 +/- 9.7864        67      169
Video       74.4 +/- 3.724         74       97
X           72.0 +/- 6.5681        71      129
Burn        66.9 +/- 5.9059        66      117
Write     9576.9 +/- 67639        250   500418                98.1           98.1
Read       209.3 +/- 11.018       209      267
Ring        80.5 +/- 8.0993        78      125
Compile    239.0 +/- 29.447       234      372

          1298.4 +/- 24118              500418

Reverting this commit obviously solves the issue for me. I have no idea why this issue appears exclusively with heavy write loads in the background.

Is this a scheduler issue, or rather something in the background?

Hi Andreas,
You're using cpufreq schedutil for your tests I'm assuming?
Is there a difference in cpufreq behavior (avg cpufreq or OPP residencies?)
Does the regression also happen on powersave/performance governor?

Actually this is a very stripped-down system. The 'performance' cpufreq governor is the only one compiled in, and the processor cores run at a fixed frequency. CONFIG_PM_OPP is not set.

That certainly makes the analysis easier.
I couldn't reproduce the issue so far on my system, but it does seem like the dl server
could get potentially unbounded running time if very frequent starting and stopping
of the dl server (which presumably happens because of the writeback) resets the
runtime, which would then lead to the 25s latency you observed.
Peter, how is the revised wakeup rule supposed to behave here?

[snip]

This seems to be a case of runtime starvation. If I change sched_rt_runtime_us to a smaller value, the benchmark returns reasonable latency values.

# echo "980000" > /proc/sys/kernel/sched_rt_runtime_us

I could live with this workaround, since it seems not to impact overall latency values in a noticeable way.

Not a very stable workaround unfortunately :/
While I try to reproduce this, what you're observing should imply that the
background SCHED_NORMAL work is enough to fully utilize the system, right?
interbench Write does 4k (buffered) writes of a 1GB file and then close+open
and repeat, nothing fancy really. Does this actually produce significant CPU
utilization for you? Can you just run the background work and see what that
looks like?
(What you're seeing looks like a bug in any case, just so I'm not going down
a wrong path when trying to reproduce here).
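For anyone who wants to run only the background work, the write pattern described above can be approximated with a few lines of C. This is a sketch written from that description, not interbench's actual code: write_pass() and its parameters are hypothetical, the buffer contents are arbitrary, and a short or interrupted write is simply treated as an error.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * One pass of the load: open the file, issue buffered write()s of `chunk`
 * bytes until `total` bytes are written, then close it again.
 * Returns the number of bytes written, or -1 on any error.
 */
static long long write_pass(const char *path, long long total, size_t chunk)
{
    char buf[4096];
    long long done = 0;
    int fd;

    if (chunk == 0 || chunk > sizeof(buf))
        return -1;
    memset(buf, 0xA5, sizeof(buf)); /* arbitrary fill pattern */

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (done < total) {
        size_t want = chunk;
        ssize_t n;

        if ((long long)want > total - done)
            want = (size_t)(total - done); /* last, partial chunk */
        n = write(fd, buf, want);
        if (n <= 0) { /* error or no progress: give up */
            close(fd);
            return -1;
        }
        done += n;
    }

    if (close(fd) < 0)
        return -1;
    return done;
}
```

Calling write_pass(path, 1LL << 30, 4096) in an endless loop from a SCHED_NORMAL process should come close to the described load (taking 1GB as 2^30 bytes is an assumption); watching that process alone with top or similar shows how much CPU time it actually consumes.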

I'd be interested if you can still reproduce with this fix:
https://lore.kernel.org/lkml/20260522125833.264145-1-gmonaco@xxxxxxxxxx/

The current 6.12.91 kernel with this patch applied shows normal latencies without 50ms outliers.

A backport of three commits that preceded d66792919d4f is also being planned, and it fixes the issue as well: https://marc.info/?l=linux-rt-users&m=177948576595996&w=2 -> https://lore.kernel.org/stable/20260522213120.1205100-1-lbckmnn@xxxxxxxxxxx

The patch above, applied on top of the proposed backport, also tested without issues.