Re: [PATCH v2 09/23] sched/cache: Count tasks prefering destination LLC in a sched group
From: Chen, Yu C
Date: Wed Dec 10 2025 - 18:52:24 EST
On 12/11/2025 12:16 AM, Peter Zijlstra wrote:
On Wed, Dec 10, 2025 at 11:05:33PM +0900, Chen, Yu C wrote:
On 12/10/2025 9:52 PM, Peter Zijlstra wrote:
On Wed, Dec 03, 2025 at 03:07:28PM -0800, Tim Chen wrote:
During LLC load balancing, tabulate the number of tasks on each runqueue
in a sched group that prefer the LLC containing env->dst_cpu.
For example, consider a system with 4 LLC sched groups (LLC0 to LLC3)
balancing towards LLC3. LLC0 has 3 tasks preferring LLC3, LLC1 has
2, and LLC2 has 1. LLC0, having the most tasks preferring LLC3, is
selected as the busiest source to pick tasks from.
Within a source LLC, the total number of tasks preferring a destination
LLC is computed by summing counts across all CPUs in that LLC. For
instance, if LLC0 has CPU0 with 2 tasks and CPU1 with 1 task preferring
LLC3, the total for LLC0 is 3.
These statistics allow the load balancer to choose tasks from source
sched groups that best match their preferred LLCs.
Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
---
Notes:
v1->v2:
Convert nr_pref_llc array in sg_lb_stats to a single
variable as only the dst LLC stat is needed.
(K Prateek Nayak)
kernel/sched/fair.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b0e87616e377..4d7803f69a74 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10445,6 +10445,9 @@ struct sg_lb_stats {
unsigned int nr_numa_running;
unsigned int nr_preferred_running;
#endif
+#ifdef CONFIG_SCHED_CACHE
+ unsigned int nr_pref_llc;
+#endif
At this point I have to note that rq->nr_pref_llc seems like a horrible
misnomer, for it being an array, and not an actual number like the
naming suggests.
In the v2 it seems that rq->nr_pref_llc is not an array anymore, it
From two patches ago:
+ unsigned int *nr_pref_llc;
It's a pointer of some sort.
Ah I see, I thought it was the variable in the sgs structure.
indicates the number of tasks that want to be migrated to env->dst_cpu
(dst_llc), because these tasks' preferred LLC is the one containing
env->dst_cpu (dst_llc). Maybe rename it to rq->nr_pref_dst_llc?
Like I said in:
https://lkml.kernel.org/r/20251210125114.GS3707891@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
it might make sense to put it in struct sched_domain instead of struct
rq, since then you can allocate and swap it right along with the rest of
the domain tree.
I'll think more about this. Currently the per-CPU rq's nr_pref_llc is
used to identify the "busiest" runqueue. The busiest runqueue is the one
with the most threads that want to be migrated to llc_id(env->dst_cpu),
because the threads' preferred LLC is there. In this way, the migration
success ratio to the preferred LLC would be higher without breaking the
imbalance too much, IMHO. So we might have to track the per-CPU rq's
statistics during enqueue/dequeue. If we put it in the domain, I'm not
sure how to track that.
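
To make the bookkeeping concrete, here is a rough userspace sketch of the
idea. It is not the patch code: every toy_* name, the fixed 8-CPU/4-LLC
layout, and the statically sized counter array are assumptions made for
illustration (in the series rq->nr_pref_llc is a pointer). It only shows
per-CPU counters updated at enqueue/dequeue, summed per source LLC, and
used to pick the source LLC with the most tasks preferring the
destination LLC.

```c
#include <assert.h>

/* Hypothetical topology: 8 CPUs, 4 LLCs, 2 CPUs per LLC. */
#define TOY_NR_CPUS      8
#define TOY_NR_LLCS      4
#define TOY_CPUS_PER_LLC (TOY_NR_CPUS / TOY_NR_LLCS)

/* Stand-in for struct rq: one counter per possible preferred LLC. */
struct toy_rq {
	unsigned int nr_pref_llc[TOY_NR_LLCS];
};

static struct toy_rq toy_runqueues[TOY_NR_CPUS];

/* Assumed mapping from CPU to LLC id (contiguous CPUs share an LLC). */
static int toy_llc_id(int cpu)
{
	return cpu / TOY_CPUS_PER_LLC;
}

/* A task whose preferred LLC is pref_llc becomes runnable on cpu. */
static void toy_enqueue(int cpu, int pref_llc)
{
	toy_runqueues[cpu].nr_pref_llc[pref_llc]++;
}

/* The matching dequeue; the counter must never underflow. */
static void toy_dequeue(int cpu, int pref_llc)
{
	assert(toy_runqueues[cpu].nr_pref_llc[pref_llc] > 0);
	toy_runqueues[cpu].nr_pref_llc[pref_llc]--;
}

/* Sum, over all CPUs in src_llc, the tasks that prefer dst_llc. */
static unsigned int toy_llc_nr_pref(int src_llc, int dst_llc)
{
	unsigned int sum = 0;
	int cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_llc_id(cpu) == src_llc)
			sum += toy_runqueues[cpu].nr_pref_llc[dst_llc];
	return sum;
}

/*
 * Pick the source LLC with the most tasks preferring the LLC of dst_cpu.
 * Returns -1 if no other LLC has such a task.
 */
static int toy_find_busiest_llc(int dst_cpu)
{
	int dst_llc = toy_llc_id(dst_cpu);
	unsigned int max = 0;
	int busiest = -1;
	int llc;

	for (llc = 0; llc < TOY_NR_LLCS; llc++) {
		unsigned int nr;

		if (llc == dst_llc)
			continue;
		nr = toy_llc_nr_pref(llc, dst_llc);
		if (nr > max) {
			max = nr;
			busiest = llc;
		}
	}
	return busiest;
}
```

With the numbers from the changelog (CPU0 with 2 tasks and CPU1 with 1
task preferring LLC3, 2 such tasks in LLC1, 1 in LLC2), the sum for LLC0
is 3 and toy_find_busiest_llc() for a dst_cpu in LLC3 returns 0. The
sketch keeps the counters per CPU because that is where enqueue/dequeue
happens; it does not try to answer how the same accounting would work if
the storage lived in the domain.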
Thanks,
Chenyu