Re: [PATCH] sched/cgroup: Lock optimize for cgroup cpu throttle

From: Xin Zhao
Date: Wed Aug 20 2025 - 04:06:32 EST


On Tue, 19 Aug 2025 13:28:52 +0200 Valentin wrote:

> What about using task_work_add() and throttling the task on its way to
> userland? The callback will be invoked without any locks held.


> > In addition to the information in my previous response to Sebastian, I would
> > like to add the following point as a reason for my self-recommendation (to
> > explore my patch for solving the cgroup performance issue in RT-Linux):
> > RT-Linux is a system that places a high emphasis on real-time performance.
> > The fact that regular tasks are also included in cgroup groups and throttled
> > suggests that they are relatively low-priority tasks that are not expected to
> > interfere with high-priority tasks. Therefore, is it not a bit too late to
> > impose limits only after returning to user mode?
>
> Throttling is purely a CFS construct, and does not affect RT or DL
> tasks (outside of the lock contention issues we're trying to fix :-)). If
> an RT or DL task needs to run, it'll just preempt the CFS tasks, it won't
> wait for any throttle or other mechanism.


Dear Valentin,

Indeed, as you pointed out, deferring the throttle does not delay high-priority
tasks: an RT or DL task simply preempts the CFS tasks when it needs to run.

I should have marked this as an RFC patch.
On one hand, it lets other RT-Linux users who are troubled by cgroup locking and
throttling issues work around these problems with my patch for the time being.
On the other hand, I hope that experts will review the patch to see whether there
are any serious or obvious issues that I have overlooked, or any areas for
further improvement.


On Tue, 19 Aug 2025 15:06:56 +0200 Sebastian wrote:

> > Dear Sebastian,
> >
> > I believe what you mentioned is related to the same issue that Valentin
> > brought up later, which is the current solution of "delaying CPU throttling
> > through the task_work mechanism until returning to user mode."
> > My colleagues and I indeed noticed this from the beginning. However, on our
> > 6.1.134 RT-Linux system, we have tried new versions of this solution one by
> > one, but they have all failed during basic script tests, so none have reached
> > the stage of being used in our project. I see that this modification has been
> > promoted in the community for more than two years, yet it remains in a state
> > that doesn't work well (on our 6.1.134 RT-Linux system). I wonder if the
> > changes require too many considerations or if this modification simply isn't
> > suitable for running on RT-Linux. Our project cannot afford to wait, and
> > there are many performance issues in RT-Linux.
>
> You are free to use the patch.
> Based on your description I assume that the patch Valentin referenced
> will solve your problem. If not, it will be interesting to know why it
> is not working. Otherwise you keep maintaining your patch.

Dear Sebastian,

I am now carefully reviewing the existing patches and applying them to our
project's RT-Linux 6.1.134 tree to see whether I can reproduce the issues we
encountered previously. I will report any problems as soon as possible.


Thanks
Xin Zhao