Re: [Patch v4 00/22] Cache aware scheduling
From: Qais Yousef
Date: Fri Apr 24 2026 - 20:14:49 EST
On 04/24/26 01:17, Chen, Yu C wrote:
> On 4/21/2026 8:34 AM, Qais Yousef wrote:
> > On 04/20/26 17:01, Chen, Yu C wrote:
> > > On 4/16/2026 8:27 AM, Qais Yousef wrote:
> > > > On 04/01/26 14:52, Tim Chen wrote:
> > >
> > > [ ... ]
> > >
> > > It seems to me that there are multiple use cases. In one scenario,
> > > the administrator (including daemons) is responsible for tagging
> > > workloads. In another, users prefer the OS to handle automatic
> > > placement without any userspace involvement.
> >
> > How do you define this automatic placement? AFAICS you're just grouping all
> > tasks of a specific process to stay within the same LLC and hitting overcommit
> > issues which you're working around with this load-balancer-only approach?
> >
> > I think in practice there will be many corner cases where state is not optimal
> > and we'd end up with heuristics to 'balance' things out and sensitivity to
> > independent changes disturbing this fragile balance, causing weird regressions
> > and slowly leaving us with less flexibility to move and shuffle code (okay,
> > maybe too much doom and gloom, but we've been through this in the past :)).
> >
> > I am not sure how many of these tests stressed the system with multiple
> > critical processes running concurrently?
> >
>
> In the initial RFC patches, we ran multi-process tests,
> where workloads were assigned by cache-aware LB to dedicated
> LLCs when under-loaded. I just conducted additional
> multi-process hackbench tests, and the results demonstrate
> improved stabilization with cache-aware LB enabled. Thus,
> I think for multi-process cases, there is no difference from
> single-process cases - the tasks can be aggregated to one LLC
> as long as it is under-loaded, no matter what process this
> migrating task belongs to.
Multi as in > num_llcs?

Doing my own tests with schedqos and monitoring the schedqos log, I was
surprised how many processes are created by the simplest of operations.

My worry is that since you assume all processes must be grouped, in real-life
scenarios you will end up with processes > num_llcs in many corner cases.

With an opt-in approach, you know exactly how many there will be and admins can
design for it.
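
To make the "processes > num_llcs" condition concrete, here is a rough
userspace sketch. It is not part of the patch series or of schedqos. It assumes
cache index3 is the LLC (not true on every system), and the helper names and
the num_groups value are hypothetical, chosen only for illustration.

```python
import glob

# Assumption: cache "index3" is the last-level cache on this machine.
LLC_GLOB = "/sys/devices/system/cpu/cpu[0-9]*/cache/index3/shared_cpu_list"


def read_llc_lists(pattern=LLC_GLOB):
    """Return the raw shared_cpu_list string of every CPU's index3 cache."""
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists


def count_llcs(shared_cpu_lists):
    """CPUs sharing an LLC report the same list, so count distinct lists."""
    return len({s.strip() for s in shared_cpu_lists if s.strip()})


def overcommitted(num_groups, num_llcs):
    """True when there are more task groups to place than LLCs to hold them."""
    return num_llcs > 0 and num_groups > num_llcs


if __name__ == "__main__":
    num_llcs = count_llcs(read_llc_lists())
    num_groups = 8  # hypothetical: number of processes that want their own LLC
    print(num_llcs, overcommitted(num_groups, num_llcs))
```

With automatic grouping, num_groups is whatever happens to be running; with
opt-in, it is only what the admin tagged.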
>
> > By making it a userspace problem they have to figure out the right balance and
> > we can focus on providing the right mechanism.
> >
>
> I totally agree that with help from userspace, task aggregation would
> become more usable. The test data will speak for itself. Once we have
> resolved the issues reported by Sashiko, we will evaluate the
> schedqos-provided interface.
Great. Happy to work closely with you to help iron out problems.