Re: [PATCH v2] sched/deadline: Reset dl_server execution state on stop

From: Gabriele Monaco

Date: Wed Jan 28 2026 - 04:50:49 EST


2026-01-27T18:55:02Z Andrea Righi <arighi@xxxxxxxxxx>:
> Unfortunately checking only runtime <= 0 isn't enough for the sched_ext DL
> server case:
>
> # Runtime of EXT task (PID 2025) is 0.000000 seconds
> # Runtime of RT task (PID 2026) is 4.990000 seconds
> # EXT task got 0.00% of total runtime
> not ok 2 FAIL: EXT task got less than 4.00% of runtime
>
> With the unconditional reset the EXT task gets 5% of the bandwidth. I'll
> add some debugging to figure out exactly what is happening.

Thanks for testing it. That's quite strange.

I ran your test on a kernel without the ext server. As far as I understand, the test also indirectly checks the fair server, and that part does not fail, right?
At least that's what I get on an arm64 machine with 128 CPUs.

After letting the test continue on failure I get:

# # Runtime of FAIR task (PID 22503) is 0.240000 seconds
# # Runtime of RT task (PID 22504) is 4.750000 seconds
# # FAIR task got 4.81% of total runtime
# ok 1 PASS: FAIR task got more than 4.00% of runtime
# TAP version 13
# 1..1
# # Runtime of EXT task (PID 22511) is 0.020000 seconds
# # Runtime of RT task (PID 22512) is 4.970000 seconds
# # EXT task got 0.40% of total runtime
# not ok 2 FAIL: EXT task got less than 4.00% of runtime
# TAP version 13
# 1..1
# # Runtime of FAIR task (PID 22518) is 0.240000 seconds
# # Runtime of RT task (PID 22519) is 4.750000 seconds
# # FAIR task got 4.81% of total runtime
# ok 3 PASS: FAIR task got more than 4.00% of runtime
# TAP version 13
# 1..1
# # Runtime of EXT task (PID 22525) is 0.000000 seconds
# # Runtime of RT task (PID 22526) is 4.990000 seconds
# # EXT task got 0.00% of total runtime
# not ok 4 FAIL: EXT task got less than 4.00% of runtime
# ok 24 rt_stall #
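
For reference, the percentages above are consistent with each share being the task's runtime divided by the sum of both runtimes. Below is a minimal sketch of that arithmetic, assuming the 4% threshold from the test messages. The helper names are hypothetical and this is not the selftest's actual code:

```python
# Hypothetical helpers reproducing the arithmetic behind the output above.
# Assumption: share = task runtime / (task runtime + RT runtime), in percent.

THRESHOLD_PCT = 4.0  # taken from the "4.00%" in the test messages


def runtime_share(task_runtime: float, rt_runtime: float) -> float:
    """Return the task's share of the combined runtime, in percent."""
    total = task_runtime + rt_runtime
    if total <= 0:
        return 0.0
    return task_runtime / total * 100.0


def passes(task_runtime: float, rt_runtime: float) -> bool:
    """True if the task got at least THRESHOLD_PCT of the combined runtime."""
    return runtime_share(task_runtime, rt_runtime) >= THRESHOLD_PCT


if __name__ == "__main__":
    # (label, task runtime, RT runtime) as reported in the runs above
    for label, task, rt in [("FAIR", 0.24, 4.75), ("EXT", 0.02, 4.97), ("EXT", 0.00, 4.99)]:
        print(f"{label}: {runtime_share(task, rt):.2f}% pass={passes(task, rt)}")
```

With the numbers above, the FAIR runs land at about 4.81% and the EXT runs at about 0.40% and 0.00%, matching the reported results.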

Note that the ext task is expected to starve here, since I didn't apply the patches enabling the server.

After applying all your patches [1], the ext task also passes the test (i.e. it gets boosted just fine).

I tried disabling all CPUs except CPU0 and running the same test, and it hung (bad sign). Then I also enabled CPU1 (2 CPUs online in total), and again I see both fair and ext getting their share.

What am I missing here?

Thanks,
Gabriele

[1] - https://lore.kernel.org/lkml/20260126100050.3854740-1-arighi@xxxxxxxxxx