Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier for hotplug

From: Christian Loehle
Date: Thu Feb 13 2025 - 10:16:46 EST


On 2/13/25 14:51, Juri Lelli wrote:
> On 13/02/25 13:38, Christian Loehle wrote:
>> On 2/13/25 13:33, Juri Lelli wrote:
>
> ...
>
>>> Not sure I get what your worry is, sorry. In my understanding when the
>>> last cpu of a policy/cluster gets offlined the corresponding sugov
>>> kthread gets stopped as well (sugov_exit)?
>>>
>>
>> The other way round.
>> We may have sugov kthread of cluster [6,7] affined to CPU1. Is it
>> guaranteed that we cannot offline CPU1 (while CPU6 or CPU7 are still
>> online)?
>
> Uhu, is this a sane/desired setup? Anyway, I would say that if CPU1 is
> offlined sugov[6,7] will need to be migrated someplace else.

Sane? I guess that's up for discussion. Unfortunately it is definitely
desirable.
As mentioned, sugov DL tasks cause a lot of idle wakeups, which are
expensive on the bigger CPUs. I experimented with two setups: always
running them locally and never sending an IPI (but that means we have
contention and still run a double switch on an 'expensive' CPU), and
running them on a little CPU. The latter had much better results.

>
>> Or without the affinity:
>> cluster [6,7] with isolcpus=6 (i.e. sugov kthread of that cluster can
>> only run on CPU7). Is offlining of CPU6 then prevented (as long as
>> CPU7 is online)?
>> I don't see how.
>> Anyway we probably want to change isolcpus and affinity to merely be
>> a suggestion for the sugov DL case. Fundamentally it belongs to what
>> is run on that CPU anyway.
>
> I would tend to agree.

I'll write something up.
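
To make the concern concrete, here is a toy userspace model of the
situation discussed above. It is only a sketch and not kernel code: the
function name and the set-based CPU masks are made up for illustration,
and it says nothing about how the hotplug path actually handles this.
It just checks whether offlining a CPU leaves a still-needed sugov
kthread without any online CPU it is allowed to run on.

```python
# Toy model only: CPU masks are plain Python sets, all names are hypothetical.

def sugov_stranded(policy_cpus, kthread_allowed, online, offlined):
    """Return True if offlining `offlined` leaves the policy with at least
    one online CPU (so its sugov kthread is still needed) while the kthread
    has no online CPU left in its allowed mask."""
    remaining = online - {offlined}
    policy_alive = bool(policy_cpus & remaining)
    can_run = bool(kthread_allowed & remaining)
    return policy_alive and not can_run


online = set(range(8))

# sugov kthread of cluster [6,7] affined to CPU1, then CPU1 goes offline.
case_affined = sugov_stranded({6, 7}, {1}, online, offlined=1)

# sugov kthread of cluster [6,7] restricted to CPU7 (CPU6 isolated).
case_isolated_off6 = sugov_stranded({6, 7}, {7}, online, offlined=6)
case_isolated_off7 = sugov_stranded({6, 7}, {7}, online, offlined=7)

# Last CPU of the policy goes away: the kthread is no longer needed.
case_last_cpu = sugov_stranded({6, 7}, {6, 7}, {0, 7}, offlined=7)
```

In this model the affined case and the "offline CPU7" case leave the
kthread with nowhere to run while its policy is still alive, which is
where the affinity would have to be treated as a suggestion and the
kthread migrated someplace else.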