Re: [RFC PATCH 00/22] sched/fair: Defer CFS throttling to exit to user mode
From: Valentin Schneider
Date: Tue Feb 25 2025 - 16:13:36 EST

On 20/02/25 17:47, Josh Don wrote:
> On Thu, Feb 20, 2025 at 7:40 AM Valentin Schneider <vschneid@xxxxxxxxxx> wrote:
> ...
>> As pointed out by Ben in [1], the issue with the per-task approach is the
>> scalability of the unthrottle. You have the rq lock held and you
>> potentially end up unthrottling a deep cgroup hierarchy, putting each
>> individual task back on its cfs_rq.
>>
>> I can't find my notes on that in a hurry, but my idea with that for a next
>> version was to periodically release the rq lock as we go up the cgroup
>> hierarchy during unthrottle - the idea being that we can mess with part of the
>> hierarchy, and as long as that part isn't connected to the rest (i.e. it's
>> not enqueued, like we currently do for CFS throttling), "it should be
>> safe".
>
> Can you elaborate a bit more? Even if we periodically release the
> lock, we're still spending quite a long time in non-preemptible kernel
> context, and unthrottle is also driven by an hrtimer. So we can still
> do quite a lot of damage depending on how long the whole loop takes.

Indeed, this only gives the rq lock a breather, but it doesn't help with
preempt / irq off.
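
To make the tradeoff concrete, here is a minimal userspace sketch of the
lock-break pattern discussed above. It is not code from the series: the
struct, the field names, unthrottle_walk() and LOCK_BREAK_BATCH are all
hypothetical, and a pthread mutex merely stands in for the rq lock. The
walk goes from a leaf group up to the root, puts parked tasks back, and
drops the lock every few levels.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup's per-CPU queue; not a kernel struct. */
struct group {
	struct group *parent;	/* NULL at the root of the hierarchy */
	int nr_parked;		/* tasks held back while throttled */
	int nr_queued;		/* tasks put back on the queue */
};

/* Assumed tunable: how many levels to process per lock hold. */
#define LOCK_BREAK_BATCH 2

/*
 * Walk from @leaf up to the root, re-queueing parked tasks at each level.
 * The lock is dropped and re-taken every LOCK_BREAK_BATCH levels so that
 * other lock waiters can make progress. Returns the number of lock breaks.
 */
static int unthrottle_walk(pthread_mutex_t *rq_lock, struct group *leaf)
{
	int levels = 0, breaks = 0;

	assert(rq_lock && leaf);
	pthread_mutex_lock(rq_lock);
	for (struct group *g = leaf; g; g = g->parent) {
		g->nr_queued += g->nr_parked;
		g->nr_parked = 0;

		/* Give the lock a breather, but not after the last level. */
		if (++levels % LOCK_BREAK_BATCH == 0 && g->parent) {
			pthread_mutex_unlock(rq_lock);
			breaks++;
			pthread_mutex_lock(rq_lock);
		}
	}
	pthread_mutex_unlock(rq_lock);
	return breaks;
}
```

Note what the sketch does not model: in the kernel the loop would still
run from the hrtimer callback, so dropping the lock shortens lock hold
times but the total time spent non-preemptible stays proportional to the
depth of the hierarchy and the number of tasks, which is Josh's point.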