Re: [RFC PATCH] sched/deadline: Avoid dl_server boosting with expired deadline
From: Gabriele Monaco
Date: Sat Nov 01 2025 - 04:44:05 EST
2025-11-01T00:08:37Z Peter Zijlstra <peterz@xxxxxxxxxxxxx>:
> On Fri, Oct 31, 2025 at 04:41:22PM +0100, Gabriele Monaco wrote:
>> On Fri, 2025-10-31 at 16:20 +0100, Peter Zijlstra wrote:
>>> On Fri, Oct 31, 2025 at 02:24:17PM +0100, Gabriele Monaco wrote:
>>>>
>>>> Different scenario if I have the CPU busy with other tasks (e.g. RT
>>>> policies), there I can see the server stopping and starting again.
>>>> After I do this I seem to get a different behaviour (even some boosting
>>>> after idle), I'm trying to understand what's going on.
>>>>
>>
>> After running some heavy RT workload (stress-ng --cpu 10 --sched rr) I do see
>> the server stopping and starting as the models would expect, but somehow it's
>> always boosting as soon as it's started.
>>
>> Apparently dl_defer_running is always 1 in that scenario. Perhaps running idle
>> counts as running something too, so it never defers. But I can't really see how
>> this happens..
>
> The transition [4], will retain dl_defer_running, such that a timely
> re-start of the dl_server can immediately run again.
Alright, I worded it poorly. As far as I understand, what you mentioned is the desired behaviour when handling starvation: we don't defer, and the next period starts boosting right away.
What I was observing instead was the server staying in the running state indefinitely.
I ran a test with 5s of RR stress-ng followed by 30s of a mostly idle DL workload on a clean VM. I expected boosting only during the first 5 seconds, but I also see it afterwards, when there was clearly no starvation (the system was idle; this is probably a bit hard to see from the trace I shared).
Thanks for the updated patch, I'll try that and see how it goes.
Gabriele