Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier for hotplug

From: Juri Lelli
Date: Tue Jan 14 2025 - 09:02:34 EST


On 14/01/25 13:52, Jon Hunter wrote:
>
> On 13/01/2025 09:32, Juri Lelli wrote:
> > On 10/01/25 18:40, Jon Hunter wrote:
> >
> > ...
> >
> > > With the above I see the following ...
> > >
> > > [ 53.919672] dl_bw_manage: cpu=5 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4
> > > [ 53.930608] dl_bw_manage: cpu=4 cap=2048 fair_server_bw=52428 total_bw=157284 dl_bw_cpus=3
> > > [ 53.941601] dl_bw_manage: cpu=3 cap=1024 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=2
> >
> > So far so good.
> >
> > > [ 53.952186] dl_bw_manage: cpu=2 cap=1024 fair_server_bw=52428 total_bw=576708 dl_bw_cpus=2
> >
> > But the line above doesn't look right.
> >
> > > [ 53.962938] dl_bw_manage: cpu=1 cap=0 fair_server_bw=52428 total_bw=576708 dl_bw_cpus=1
> > > [ 53.971068] Error taking CPU1 down: -16
> > > [ 53.974912] Non-boot CPUs are not disabled
> >
> > What is the topology of your board?
> >
> > Are you using any cpuset configuration for partitioning CPUs?
>
>
> I just noticed that by default we do boot this board with 'isolcpus=1-2'. I
> see that this is a deprecated cmdline argument now and I must admit I don't
> know the history of this for this specific board. It is quite old now.
>
> Thierry, I am curious if you have this set for Tegra186 or not? Looks like
> our BSP (r35 based) sets this by default.
>
> I did try removing this and that does appear to fix it.

OK, good.

> Juri, let me know your thoughts.

Thanks for the additional info. I guess I could now try to reproduce
this by booting with isolcpus on systems I have access to (to possibly
understand what the underlying problem is).
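
As a quick sanity check on the log quoted above, here is a minimal Python
sketch (not kernel code). It assumes, based only on the first three lines,
that total_bw should equal dl_bw_cpus * fair_server_bw when nothing but the
fair servers contributes bandwidth. The names LOG, PATTERN and check are
made up for illustration.

```python
import re

# Log lines copied from the report above (timestamps dropped).
LOG = """\
dl_bw_manage: cpu=5 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4
dl_bw_manage: cpu=4 cap=2048 fair_server_bw=52428 total_bw=157284 dl_bw_cpus=3
dl_bw_manage: cpu=3 cap=1024 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=2
dl_bw_manage: cpu=2 cap=1024 fair_server_bw=52428 total_bw=576708 dl_bw_cpus=2
dl_bw_manage: cpu=1 cap=0 fair_server_bw=52428 total_bw=576708 dl_bw_cpus=1
"""

PATTERN = re.compile(
    r"cpu=(\d+) cap=(\d+) fair_server_bw=(\d+) total_bw=(\d+) dl_bw_cpus=(\d+)"
)


def check(log):
    """Compare each line's total_bw with dl_bw_cpus * fair_server_bw.

    ASSUMPTION: only the per-CPU fair servers contribute bandwidth, so the
    two values should match. This is inferred from the log, not from the
    scheduler code.
    """
    rows = []
    for m in PATTERN.finditer(log):
        cpu, _cap, fair, total, cpus = map(int, m.groups())
        expected = cpus * fair
        rows.append(
            {
                "cpu": cpu,
                "total_bw": total,
                "expected": expected,
                "ok": total == expected,
                # How many fair servers' worth of bandwidth total_bw holds.
                "servers": total // fair,
                "remainder": total % fair,
            }
        )
    return rows


RESULTS = check(LOG)

for row in RESULTS:
    status = "ok" if row["ok"] else "MISMATCH"
    print(
        f"cpu={row['cpu']} total_bw={row['total_bw']} "
        f"expected={row['expected']} servers={row['servers']} {status}"
    )
```

Under that assumption the first three lines match, while the cpu=2 and
cpu=1 lines both report 576708, which is exactly 11 * 52428.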

Best,
Juri