Re: [PATCH v3 00/21] Cache Aware Scheduling

From: Qais Yousef

Date: Thu Feb 19 2026 - 22:29:54 EST


On 02/19/26 10:11, Tim Chen wrote:
> On Thu, 2026-02-19 at 23:07 +0800, Chen, Yu C wrote:
> > Hi Peter, Qais,
> >
> > On 2/19/2026 10:41 PM, Peter Zijlstra wrote:
> > > On Thu, Feb 19, 2026 at 02:08:28PM +0000, Qais Yousef wrote:
> > > > On 02/10/26 14:18, Tim Chen wrote:
> >
> > [ ... ]
> >
> > > >
> > > > I admit I have yet to look fully at the series. But I must ask, why are you
> > > > deferring to load balance and not looking at the wake up path? LB should be for
> > > > corrections. When the wake up path is making the wrong decision all the time,
> > > > LB (which is super slow to react) is too late to start grouping tasks. What am
> > > > I missing?
> > >
> > > There used to be wakeup steering, but I'm not sure that still exists in
> > > this version (still need to read beyond the first few patches). It isn't
> > > hard to add.
> > >
> >
> > Please let me explain a little more about why we did this in the
> > load balance path. Yes, the original version implemented cache-aware
> > scheduling only in the wakeup path. According to our testing, this appeared
> > to cause some task bouncing issues across LLCs. This was due to conflicts
> > with the legacy load balancer, which tries to spread tasks to different
> > LLCs.
> > So as Peter said, the load balancer should be taken care of anyway. Later,
> > we kept the cache aware logic only in the load balancer, and the test
> > results became much more stable, so we kept it as is. The wakeup fast path
> > more or less aggregates the wakees (threads within the same process)
> > within the LLC, so we have not changed it for now.
> >
> > Let me copy the changelog from the previous patch version:
> >
> > "
> > In previous versions, aggregation of tasks was done in the
> > wake up path, without making load balancing paths aware of
> > LLC (Last-Level-Cache) preference. This led to the following
> > problems:
> >
> > 1) Aggregation of tasks during wake up led to load imbalance
> > between LLCs
> > 2) Load balancing tried to even out the load between LLCs
> > 3) Wake up task aggregation happened at a faster rate while
> > load balancing moved tasks in the opposite direction, leading
> > to continuous and excessive task migrations and regressions
> > in benchmarks like schbench.
> >
> > In this version, load balancing is made cache-aware. The main
> > idea of cache-aware load balancing consists of two parts:
> >
> > 1) Identify tasks that prefer to run on their hottest LLC and
> > move them there.
> > 2) Prevent generic load balancing from moving a task out of
> > its hottest LLC.
> > "
> >
>
> Another reason why we moved away from doing things in the wake up
> path is load imbalance. The wake up path does not have load
> information for the LLC sched domains that is as up to date as in
> the load balance path. So you may actually have everyone rush

What's the reason wake up doesn't have the latest info? Is this a limitation of
these large systems where stats updates are too expensive to do? Is it not
fixable at all?

> into their favorite LLC and cause LLC overload. And load balance
> will have to undo this. This led to frequent task migrations that
> hurt performance.
>
> It is better to consider LLC preference in the load balance path
> so we can aggregate tasks while still keeping load imbalance under
> control.
>
> Tim
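
To make the ping-pong described in the quoted changelog concrete, here is a
toy model in Python. It is only a sketch under made-up assumptions and is not
kernel code: two LLCs, eight tasks that all prefer LLC 0, a wake up path that
pulls up to two tasks per round without checking load, a legacy balancer that
evens out task counts, and a hypothetical llc_capacity limit standing in for
"keeping load imbalance under control". The function and parameter names are
invented for illustration.

```python
def simulate(cache_aware, rounds=20, ntasks=8, llc_capacity=6):
    """Toy model: return (total migrations, tasks on LLC 0 at the end).

    loc[i] is the LLC (0 or 1) that task i currently runs on.
    Every task is assumed to prefer LLC 0.
    """
    loc = [i % 2 for i in range(ntasks)]  # start evenly spread
    migrations = 0

    for _ in range(rounds):
        if not cache_aware:
            # Wake up path: pull up to 2 tasks per round into the
            # preferred LLC, with no view of the resulting load.
            moved = 0
            for i in range(ntasks):
                if loc[i] != 0 and moved < 2:
                    loc[i] = 0
                    moved += 1
                    migrations += 1
            # Legacy load balance: even out the task counts again,
            # ignoring where tasks would prefer to run.
            while loc.count(0) - loc.count(1) > 1:
                loc[loc.index(0)] = 1
                migrations += 1
        else:
            # Cache aware load balance: move a task to its preferred
            # LLC only while that LLC stays under the capacity limit,
            # and never push a task out of its preferred LLC.
            for i in range(ntasks):
                if loc[i] != 0 and loc.count(0) < llc_capacity:
                    loc[i] = 0
                    migrations += 1

    return migrations, loc.count(0)


if __name__ == "__main__":
    print("wake up aggregation + legacy LB:", simulate(cache_aware=False))
    print("cache aware LB only:           ", simulate(cache_aware=True))
```

In this model the first configuration keeps migrating tasks every round and
never settles, while the second one converges after the first round and then
stops moving tasks. The numbers mean nothing beyond the toy; the point is only
that two mechanisms pulling in opposite directions never reach a fixed point.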