Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier for hotplug

From: Jon Hunter
Date: Fri Jan 10 2025 - 13:42:15 EST


Hi Juri,

On 10/01/2025 15:45, Juri Lelli wrote:
> Hi Jon,
>
> On 10/01/25 11:52, Jon Hunter wrote:
> > Hi Juri,
> >
> > ...
> >
> > I have noticed a suspend regression on one of our Tegra boards and bisect is
> > pointing to this commit. If I revert this on top of -next then I don't see
> > the issue.
> >
> > The only messages I see when suspend fails are ...
> >
> > [ 53.905976] Error taking CPU1 down: -16
> > [ 53.909887] Non-boot CPUs are not disabled
> >
> > So far this is only happening on Tegra186 (ARM64). Let me know if you have
> > any thoughts.
>
> Are you running any DEADLINE task in your configuration?

Not that I am aware of.

> In any case, could you please repro with the following (as a start)?
> It should print additional debugging info on the console.
>
> Thanks!
> Juri
>
> ---
>  kernel/sched/deadline.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 62192ac79c30..77736bab1992 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -3530,6 +3530,7 @@ static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw)
>  		 * dl_servers we can discount, as tasks will be moved out the
>  		 * offlined CPUs anyway.
>  		 */
> +		printk_deferred("%s: cpu=%d cap=%lu fair_server_bw=%llu total_bw=%llu dl_bw_cpus=%d\n", __func__, cpu, cap, fair_server_bw, dl_b->total_bw, dl_bw_cpus(cpu));
>  		if (dl_b->total_bw - fair_server_bw > 0) {
>  			/*
>  			 * Leaving at least one CPU for DEADLINE tasks seems a

With the above I see the following ...

[ 53.919672] dl_bw_manage: cpu=5 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4
[ 53.930608] dl_bw_manage: cpu=4 cap=2048 fair_server_bw=52428 total_bw=157284 dl_bw_cpus=3
[ 53.941601] dl_bw_manage: cpu=3 cap=1024 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=2
[ 53.952186] dl_bw_manage: cpu=2 cap=1024 fair_server_bw=52428 total_bw=576708 dl_bw_cpus=2
[ 53.962938] dl_bw_manage: cpu=1 cap=0 fair_server_bw=52428 total_bw=576708 dl_bw_cpus=1
[ 53.971068] Error taking CPU1 down: -16
[ 53.974912] Non-boot CPUs are not disabled

Thanks
Jon

--
nvpublic