Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier for hotplug

From: Juri Lelli
Date: Wed Feb 05 2025 - 01:54:11 EST


On 03/02/25 11:01, Jon Hunter wrote:
> Hi Juri,
>
> On 16/01/2025 15:55, Juri Lelli wrote:
> > On 16/01/25 13:14, Jon Hunter wrote:

...

> > > [ 210.595431] dl_bw_manage: cpu=5 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4
> > > [ 210.606269] dl_bw_manage: cpu=4 cap=2048 fair_server_bw=52428 total_bw=157284 dl_bw_cpus=3
> > > [ 210.617281] dl_bw_manage: cpu=3 cap=1024 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=2
> > > [ 210.627205] dl_bw_manage: cpu=2 cap=1024 fair_server_bw=52428 total_bw=262140 dl_bw_cpus=2
> > > [ 210.637752] dl_bw_manage: cpu=1 cap=0 fair_server_bw=52428 total_bw=262140 dl_bw_cpus=1
> > ^
> > Different than before but still not what I expected. Looks like there
> > are conditions/path I currently cannot replicate on my setup, so more
> > thinking. Unfortunately I will be out traveling next week, so this
> > might require a bit of time.
>
>
> I see that this is now in the mainline and our board is still failing to
> suspend. Let me know if there is anything else you need me to test.

Ah, can you actually add 'sched_verbose' to your kernel cmdline? It
should print out additional debug info on the console when domains get
reconfigured by hotplug/suspend, e.g.

dl_bw_manage: cpu=3 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4
CPU0 attaching NULL sched-domain.
CPU3 attaching NULL sched-domain.
CPU4 attaching NULL sched-domain.
CPU5 attaching NULL sched-domain.
CPU0 attaching sched-domain(s):
domain-0: span=0,4-5 level=MC
groups: 0:{ span=0 cap=766 }, 4:{ span=4 cap=908 }, 5:{ span=5 cap=989 }
CPU4 attaching sched-domain(s):
domain-0: span=0,4-5 level=MC
groups: 4:{ span=4 cap=908 }, 5:{ span=5 cap=989 }, 0:{ span=0 cap=766 }
CPU5 attaching sched-domain(s):
domain-0: span=0,4-5 level=MC
groups: 5:{ span=5 cap=989 }, 0:{ span=0 cap=766 }, 4:{ span=4 cap=908 }
root domain span: 0,4-5
rd 0,4-5: Checking EAS, CPUs do not have asymmetric capacities
psci: CPU3 killed (polled 0 ms)

Can you please share this information as well if you are able to collect
it (while still running with my last proposed fix)?
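
As a rough sanity check on the dl_bw_manage lines quoted earlier in the
thread, the small script below compares total_bw against
dl_bw_cpus * fair_server_bw for each line. This is only a sketch and rests
on an assumption that is not stated in the thread: that no other
SCHED_DEADLINE tasks are running, so each CPU counted in dl_bw_cpus
contributes exactly one fair server's bandwidth. The helper name
check_line() is made up for illustration. The value 52428 is consistent
with (50 << 20) / 1000, i.e. a 50ms/1s reservation in 20-bit fixed point,
but that reading is likewise an assumption here.

```python
import re

# Matches the debug line format quoted in this thread.
LINE_RE = re.compile(
    r"dl_bw_manage: cpu=(\d+) cap=(\d+) fair_server_bw=(\d+) "
    r"total_bw=(\d+) dl_bw_cpus=(\d+)"
)

def check_line(line):
    """Parse one dl_bw_manage line and compare total_bw with the
    value expected under the assumption of one fair server per CPU."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    cpu, cap, fair_bw, total_bw, cpus = map(int, m.groups())
    expected = cpus * fair_bw  # assumption: fair servers only
    return {
        "cpu": cpu,
        "total_bw": total_bw,
        "expected": expected,
        "consistent": total_bw == expected,
    }

# Lines copied from the quoted log (timestamps kept as-is).
LOG = """\
[ 210.595431] dl_bw_manage: cpu=5 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4
[ 210.606269] dl_bw_manage: cpu=4 cap=2048 fair_server_bw=52428 total_bw=157284 dl_bw_cpus=3
[ 210.617281] dl_bw_manage: cpu=3 cap=1024 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=2
[ 210.627205] dl_bw_manage: cpu=2 cap=1024 fair_server_bw=52428 total_bw=262140 dl_bw_cpus=2
[ 210.637752] dl_bw_manage: cpu=1 cap=0 fair_server_bw=52428 total_bw=262140 dl_bw_cpus=1
"""

results = [r for r in map(check_line, LOG.splitlines()) if r is not None]

for r in results:
    status = "ok" if r["consistent"] else "MISMATCH"
    print(f"cpu={r['cpu']} total_bw={r['total_bw']} "
          f"expected={r['expected']} {status}")
```

Under that assumption the first three lines are consistent, while the
cpu=2 and cpu=1 lines report a total_bw of 5 * 52428 against 2 and 1
accounted CPUs respectively.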

Thanks!
Juri