Re: [PATCH v2 07/23] sched/cache: Introduce per runqueue task LLC preference counter

From: Peter Zijlstra

Date: Thu Dec 11 2025 - 05:32:45 EST


On Wed, Dec 10, 2025 at 10:49:14AM -0800, Tim Chen wrote:
> On Wed, 2025-12-10 at 13:51 +0100, Peter Zijlstra wrote:
> > On Wed, Dec 03, 2025 at 03:07:26PM -0800, Tim Chen wrote:

> > Would it perhaps be easier to stick this thing in rq->sd rather than in
> > rq->nr_pref_llc. That way it automagically switches with the 'new'
> > domain. And then, with a bit of care, a single load-balance pass should
> > see a consistent view (there should not be reloads of rq->sd -- which
> > will be a bit of an audit I suppose).
>
> We need nr_pref_llc information at the runqueue level because the load balancer 
> must identify which specific rq has the largest number of tasks that 
> prefer a given destination LLC. If we move the counter to the LLC’s sd 
> level, we would only know the aggregate number of tasks in the entire LLC 
> that prefer that destination—not which rq they reside on. Without per-rq 
> counts, we would not be able to select the correct source rq to pull tasks from.
>
> The only way this could work at the LLC-sd level is if all CPUs within 
> the LLC shared a single runqueue, which is not the case today.
>
> Let me know if I understand your comments correctly.

The sched_domain instances are per-CPU (hence the need for
sched_domain_shared). So irrespective of what level you stick the
counters at (I was thinking the bottom-most, but it really doesn't
matter), they will still be per CPU.
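
To illustrate the point, here is a rough userspace sketch, not kernel
code. It assumes a simplified struct sched_domain that carries a
hypothetical nr_pref_llc[] array, one instance per CPU reached through
rq->sd. The names NR_CPUS, NR_LLCS, cpu_sd, runqueues, attach_domains()
and busiest_rq_for_llc() are made up for the example. Because each CPU
has its own domain instance, the balancer can still tell which rq holds
the most tasks preferring a given LLC:

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* assumed sizes, for illustration only */
#define NR_LLCS 2

/* Simplified stand-in for the kernel's per-CPU struct sched_domain. */
struct sched_domain {
	/* hypothetical: tasks queued on this CPU that prefer each LLC */
	unsigned int nr_pref_llc[NR_LLCS];
};

/* Simplified stand-in for struct rq; only the domain pointer is modelled. */
struct rq {
	struct sched_domain *sd;
};

struct sched_domain cpu_sd[NR_CPUS];	/* one domain instance per CPU */
struct rq runqueues[NR_CPUS];

/* Point each rq at its own domain, so the counters switch with rq->sd. */
void attach_domains(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		runqueues[cpu].sd = &cpu_sd[cpu];
}

/*
 * Return the CPU whose rq has the most tasks preferring dst_llc,
 * or -1 if no rq has any such task.
 */
int busiest_rq_for_llc(int dst_llc)
{
	int best = -1;
	unsigned int max = 0;

	assert(dst_llc >= 0 && dst_llc < NR_LLCS);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		/* read rq->sd once per rq so the pass sees one consistent view */
		struct sched_domain *sd = runqueues[cpu].sd;

		if (sd && sd->nr_pref_llc[dst_llc] > max) {
			max = sd->nr_pref_llc[dst_llc];
			best = cpu;
		}
	}
	return best;
}
```

After attach_domains(), bumping cpu_sd[2].nr_pref_llc[1] above the
others makes busiest_rq_for_llc(1) pick CPU 2, so the per-rq
information is not lost by hanging the counters off rq->sd.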