Re: [PATCH v3 2/5] sched/deadline: Fix reclaim inaccuracy with SMP

From: luca abeni
Date: Sat May 20 2023 - 05:59:11 EST


Hi again,

On Fri, 19 May 2023 12:12:50 -0400
Vineeth Remanan Pillai <vineeth@xxxxxxxxxxxxxxx> wrote:
[...]
> - cpu util may go to 100% when we have tasks with large bandwidth
> close to Umax
>
> As an eg. for issue 1, three tasks - (7,10) (3,10) and (1,10):
> TID[590]: RECLAIM=1, (r=7ms, d=10ms, p=10ms), Util: 95.20
> TID[591]: RECLAIM=1, (r=3ms, d=10ms, p=10ms), Util: 81.94
> TID[592]: RECLAIM=1, (r=1ms, d=10ms, p=10ms), Util: 27.19
>
> re. issue 2, four tasks with same reservation (7,10), tasks tries
> to reclaim leading to 100% cpu usage on all three cpus and leads to
> system hang.

I just tried to repeat this test on a VM with 3 CPUs, and I can
reproduce the stall (100% of CPU time reclaimed by SCHED_DEADLINE
tasks, with no possibility for the other tasks to execute) when I use
dq = -(max{u_i / Umax, (Umax - Uinact - Uextra)}) * dt

But when I use
dq = -(max{u_i, (Umax - Uinact - Uextra)} / Umax) * dt
everything works as expected: the 4 tasks reclaim 95% of the CPU
time and my shell is still active
(so I cannot reproduce the starvation issue with this equation).

So, I now think the second one is the correct equation to be used.
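
To make the difference between the two equations concrete, here is a rough back-of-the-envelope sketch (not kernel code) for the 4 x (7,10) case on 3 CPUs. It assumes Umax = 0.95, Uinact = 0 (all tasks always want to run), and Uextra = Umax - sum(u_i) / nr_cpus; those values, and all the helper names, are assumptions made for illustration, not something measured in the test above. A task whose budget is depleted at rate r executes for u_i / r of each period.

```python
# Illustrative model only: compares the two candidate depletion rules.
# All names and parameter values here are hypothetical, chosen for the example.

def rate_eq1(u_i, u_max, u_inact, u_extra):
    # dq = -(max{u_i / Umax, (Umax - Uinact - Uextra)}) * dt
    return max(u_i / u_max, u_max - u_inact - u_extra)

def rate_eq2(u_i, u_max, u_inact, u_extra):
    # dq = -(max{u_i, (Umax - Uinact - Uextra)} / Umax) * dt
    return max(u_i, u_max - u_inact - u_extra) / u_max

def total_cpu_fraction(rate, u_i, n_tasks, n_cpus):
    # A task with budget Q = u_i * P, depleted at `rate`, runs Q / rate
    # per period, i.e. a fraction u_i / rate of one CPU (capped at 1).
    per_task = min(1.0, u_i / rate)
    return min(1.0, n_tasks * per_task / n_cpus)

n_tasks, n_cpus = 4, 3
u_i = 0.7            # runtime 7ms, period 10ms
u_max = 0.95         # assumed maximum reclaimable utilization
u_inact = 0.0        # assumed: no inactive bandwidth
u_extra = u_max - n_tasks * u_i / n_cpus   # assumed per-CPU spare bandwidth

total_eq1 = total_cpu_fraction(rate_eq1(u_i, u_max, u_inact, u_extra),
                               u_i, n_tasks, n_cpus)
total_eq2 = total_cpu_fraction(rate_eq2(u_i, u_max, u_inact, u_extra),
                               u_i, n_tasks, n_cpus)

print(f"eq1: {total_eq1:.4f} of total CPU time")
print(f"eq2: {total_eq2:.4f} of total CPU time")
```

Under these assumptions, the first equation lets the four tasks take all of the CPU time, while the second one caps them at Umax, which is consistent with what I see in the VM.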



Luca