Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier for hotplug

From: Juri Lelli
Date: Fri Jan 10 2025 - 10:45:43 EST


Hi Jon,

On 10/01/25 11:52, Jon Hunter wrote:
> Hi Juri,
>

...

> I have noticed a suspend regression on one of our Tegra boards and bisect is
> pointing to this commit. If I revert this on top of -next then I don't see
> the issue.
>
> The only messages I see when suspend fails are ...
>
> [ 53.905976] Error taking CPU1 down: -16
> [ 53.909887] Non-boot CPUs are not disabled
>
> So far this is only happening on Tegra186 (ARM64). Let me know if you have
> any thoughts.

Are you running any DEADLINE task in your configuration?

In any case, could you please repro with the following (as a start)?
It should print additional debugging info on the console.

Thanks!
Juri

---
 kernel/sched/deadline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 62192ac79c30..77736bab1992 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -3530,6 +3530,7 @@ static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw)
 		 * dl_servers we can discount, as tasks will be moved out the
 		 * offlined CPUs anyway.
 		 */
+		printk_deferred("%s: cpu=%d cap=%lu fair_server_bw=%llu total_bw=%llu dl_bw_cpus=%d\n", __func__, cpu, cap, fair_server_bw, dl_b->total_bw, dl_bw_cpus(cpu));
 		if (dl_b->total_bw - fair_server_bw > 0) {
 			/*
 			 * Leaving at least one CPU for DEADLINE tasks seems a