Re: [RFC PATCH 00/22] sched/fair: Defer CFS throttling to exit to user mode
From: Josh Don
Date: Thu Feb 20 2025 - 20:47:21 EST
On Thu, Feb 20, 2025 at 7:40 AM Valentin Schneider <vschneid@xxxxxxxxxx> wrote:
...
> As pointed out by Ben in [1], the issue with the per-task approach is the
> scalability of the unthrottle. You have the rq lock held and you
> potentially end up unthrottling a deep cgroup hierarchy, putting each
> individual task back on its cfs_rq.
>
> I can't find my notes on that in a hurry, but my idea with that for a next
> version was to periodically release the rq lock as we go up the cgroup
> hierarchy during unthrottle - the idea being that we can mess with part of
> the hierarchy, and as long as that part isn't connected to the rest (i.e. it's
> not enqueued, like we currently do for CFS throttling), "it should be
> safe".

Can you elaborate a bit more? Even if we periodically release the
lock, we're still spending quite a long time in non-preemptible kernel
context, and unthrottle is also driven by an hrtimer. So we can still
do quite a lot of damage depending on how long the whole loop takes.
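
To make the concern concrete, here is a rough userspace sketch, not kernel
code. Everything in it (fake_rq, fake_lock/fake_unlock, unthrottle_batched,
the batch parameter) is a hypothetical stand-in, and it assumes "periodically
release the lock" means dropping it after a fixed number of re-enqueued
tasks. It only models the accounting: batching caps how much work is done
under a single lock hold, but the total work done by the loop stays the same.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a runqueue and its lock; not a kernel API. */
struct fake_rq {
    int locked;             /* 1 while the "rq lock" is held */
    size_t nr_enqueued;     /* tasks put back on the queue so far */
};

struct unthrottle_stats {
    size_t lock_sections;   /* how many times the lock was taken */
    size_t max_section;     /* most tasks re-enqueued under one hold */
    size_t total;           /* tasks re-enqueued by the whole loop */
};

static void fake_lock(struct fake_rq *rq)
{
    assert(!rq->locked);
    rq->locked = 1;
}

static void fake_unlock(struct fake_rq *rq)
{
    assert(rq->locked);
    rq->locked = 0;
}

/*
 * Re-enqueue nr_tasks tasks, dropping the lock after every `batch` tasks.
 * In this model nothing else runs between unlock and the next lock; the
 * point is only to count lock sections versus total work.
 */
static struct unthrottle_stats
unthrottle_batched(struct fake_rq *rq, size_t nr_tasks, size_t batch)
{
    struct unthrottle_stats st = { 0, 0, 0 };
    size_t done = 0;

    assert(batch > 0);

    while (done < nr_tasks) {
        size_t in_section = 0;

        fake_lock(rq);
        st.lock_sections++;
        while (done < nr_tasks && in_section < batch) {
            rq->nr_enqueued++;      /* "put the task back on its cfs_rq" */
            done++;
            in_section++;
        }
        fake_unlock(rq);

        if (in_section > st.max_section)
            st.max_section = in_section;
    }

    st.total = done;
    return st;
}
```

With nr_tasks = 1000 and batch = 64, the longest single hold covers 64 tasks,
but the loop as a whole still walks all 1000. If that loop runs from the
hrtimer callback, the time that matters is proportional to total, not to
max_section.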