Re: [PATCH v2 09/23] sched/cache: Count tasks prefering destination LLC in a sched group
From: Tim Chen
Date: Wed Dec 10 2025 - 14:04:57 EST
On Wed, 2025-12-10 at 16:16 +0100, Peter Zijlstra wrote:
> On Wed, Dec 10, 2025 at 11:05:33PM +0900, Chen, Yu C wrote:
> > On 12/10/2025 9:52 PM, Peter Zijlstra wrote:
> > > On Wed, Dec 03, 2025 at 03:07:28PM -0800, Tim Chen wrote:
> > > > During LLC load balancing, tabulate the number of tasks on each runqueue
> > > > that prefer the LLC containing env->dst_cpu in a sched group.
> > > >
> > > > For example, consider a system with 4 LLC sched groups (LLC0 to LLC3)
> > > > balancing towards LLC3. LLC0 has 3 tasks preferring LLC3, LLC1 has
> > > > 2, and LLC2 has 1. LLC0, having the most tasks preferring LLC3, is
> > > > selected as the busiest source to pick tasks from.
> > > >
> > > > Within a source LLC, the total number of tasks preferring a destination
> > > > LLC is computed by summing counts across all CPUs in that LLC. For
> > > > instance, if LLC0 has CPU0 with 2 tasks and CPU1 with 1 task preferring
> > > > LLC3, the total for LLC0 is 3.
> > > >
> > > > These statistics allow the load balancer to choose tasks from source
> > > > sched groups that best match their preferred LLCs.
> > > >
> > > > Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> > > > ---
> > > >
> > > > Notes:
> > > > v1->v2:
> > > > Convert nr_pref_llc array in sg_lb_stats to a single
> > > > variable as only the dst LLC stat is needed.
> > > > (K Prateek Nayak)
> > > >
> > > > kernel/sched/fair.c | 12 ++++++++++++
> > > > 1 file changed, 12 insertions(+)
> > > >
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index b0e87616e377..4d7803f69a74 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -10445,6 +10445,9 @@ struct sg_lb_stats {
> > > > unsigned int nr_numa_running;
> > > > unsigned int nr_preferred_running;
> > > > #endif
> > > > +#ifdef CONFIG_SCHED_CACHE
> > > > + unsigned int nr_pref_llc;
> > > > +#endif
> > >
> > > At this point I have to note that rq->nr_pref_llc seems like a horrible
> > > misnomer, for it being an array, and not an actual number like the
> > > naming suggests.
> >
> > In the v2 it seems that rq->nr_pref_llc is not an array anymore, it
>
> From two patches ago:
>
> + unsigned int *nr_pref_llc;
>
> It's a pointer of some sort.

Perhaps I should have used a different name here when I updated this patch
for v2.

rq->nr_pref_llc[] is an array, as it records the number of tasks preferring
each LLC. However, sgs->nr_pref_llc is a single number: the number of tasks
in the current sched group that prefer the destination LLC.

Sorry for using the same name, which may have created this confusion.
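
To make the distinction between the two counters concrete, here is a minimal
userspace sketch. It is not the patch code: the struct and function names
(rq_sketch, sgs_sketch, sgs_count_pref_dst_llc, example_llc0_total) and the
fixed NR_LLCS are made up for illustration, and the real rq->nr_pref_llc is a
pointer rather than a fixed-size array.

```c
#include <assert.h>
#include <stddef.h>

#define NR_LLCS 4	/* assumed: LLC0..LLC3, as in the commit message example */

/* Hypothetical stand-in for struct rq: one counter per LLC. */
struct rq_sketch {
	unsigned int nr_pref_llc[NR_LLCS];
};

/* Hypothetical stand-in for struct sg_lb_stats: only the dst LLC count. */
struct sgs_sketch {
	unsigned int nr_pref_llc;
};

/* Sum, over the runqueues of one sched group, the tasks preferring dst_llc. */
static void sgs_count_pref_dst_llc(struct sgs_sketch *sgs,
				   const struct rq_sketch *rqs, size_t nr_rqs,
				   unsigned int dst_llc)
{
	size_t i;

	assert(dst_llc < NR_LLCS);
	sgs->nr_pref_llc = 0;
	for (i = 0; i < nr_rqs; i++)
		sgs->nr_pref_llc += rqs[i].nr_pref_llc[dst_llc];
}

/* Commit message example: in LLC0, CPU0 has 2 and CPU1 has 1 task preferring LLC3. */
static unsigned int example_llc0_total(void)
{
	struct rq_sketch llc0_rqs[2] = {
		{ .nr_pref_llc = { [3] = 2 } },
		{ .nr_pref_llc = { [3] = 1 } },
	};
	struct sgs_sketch sgs;

	sgs_count_pref_dst_llc(&sgs, llc0_rqs, 2, 3);
	return sgs.nr_pref_llc;
}
```

With the numbers from the commit message, example_llc0_total() sums 2 + 1 and
yields 3 for LLC0, which is the single per-group value that sgs->nr_pref_llc
is meant to hold.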
>
>
> > indicates the number of tasks that want to be migrated to the
> > env->dst_cpu (dst_llc), because these tasks' preferred LLC are
> > env->dst_cpu(dst_llc). Maybe renaming it to rq->nr_pref_dst_llc?
>
> Like I said in:
>
> https://lkml.kernel.org/r/20251210125114.GS3707891@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> it might make sense to put it in struct sched_domain instead of struct
> rq, since then you can allocate and swap it right along with the rest of
> the domain tree.

I sent a separate reply to that comment to clarify why I think we need
nr_pref_llc[] per run queue.

Tim