Re: [PATCH v3 00/21] Cache Aware Scheduling
From: Qais Yousef
Date: Thu Feb 19 2026 - 09:08:40 EST
On 02/10/26 14:18, Tim Chen wrote:
> This patch series introduces infrastructure for cache-aware load
> balancing, with the goal of co-locating tasks that share data within
> the same Last Level Cache (LLC) domain. By improving cache locality,
> the scheduler can reduce cache bouncing and cache misses, ultimately
> improving data access efficiency. The design builds on the initial
> prototype from Peter [1].
>
> This initial implementation treats threads within the same process as
> entities that are likely to share data. During load balancing, the
This is a very aggressive assumption. From what I've seen, only a few tasks
truly share data. Lumping everything in a process together is an easy way to
classify, but I think we can do better.
> scheduler attempts to aggregate such threads onto the same LLC domain
> whenever possible.
I admit I have yet to look fully at the series. But I must ask: why are you
deferring to load balance instead of looking at the wake up path? LB should be
for corrections. If the wake up path is making the wrong decision all the time,
isn't LB (which is super slow to react) too late to start grouping tasks? What
am I missing?
In my head Core Scheduling is already doing what we want. We just need to
extend it to be a bit more relaxed (best effort, rather than completely strict
as it is today for security reasons). This would be a lot more flexible and
would allow tasks to be co-located from the get-go. And it would defer the
responsibility of tagging to userspace: if they do better or worse, it's on
them :) It seems you already hit a corner case where the grouping was a bad
idea and are doing some magic with thread counts to alleviate it.
FWIW I have come across cases in the mobile world where co-locating on a
cluster or a 'big' core with a big L2 cache can benefit a small group of
tasks. So the concept is generally beneficial, as cache hierarchies are not
symmetrical in many systems now. Even on symmetrical systems, a case can be
made that two small data-dependent tasks can benefit from packing on a single
CPU.
I know this changes the direction being taken here; but I strongly believe the
right way is to extend the wake up path rather than lump it solely in LB
(IIUC).
Note I am looking at NETLINK to enable our proposed Sched QoS library to
listen to critical events like a process being created and tasks being forked,
to auto-tag them. Userspace would easily be able to tag individual tasks as
co-dependent, or ask for a whole process to be tagged as such (assign the same
cookie to all forked tasks of that process). We should not need any magic in
the kernel then, other than providing the mechanisms for userspace to shoot
itself in the foot (or do better ;-))