Re: [PATCH 02/19] sched/fair: Record per-LLC utilization to guide cache-aware scheduling decisions

From: Chen, Yu C

Date: Mon Oct 27 2025 - 10:20:29 EST


Hi Prateek,

On 10/27/2025 1:01 PM, K Prateek Nayak wrote:
Hello Tim,

On 10/11/2025 11:54 PM, Tim Chen wrote:
+#ifdef CONFIG_SCHED_CACHE
+/*
+ * Record the statistics for this scheduler group for later
+ * use. These values guide load balancing on aggregating tasks
+ * to a LLC.
+ */
+static void record_sg_llc_stats(struct lb_env *env,
+				struct sg_lb_stats *sgs,
+				struct sched_group *group)
+{
+	/*
+	 * Find the child domain on env->dst_cpu. This domain
+	 * is either the domain that spans this group(if the
+	 * group is a local group), or the sibling domain of
+	 * this group.
+	 */
+	struct sched_domain *sd = env->sd->child;

Was this intentionally done to limit the update to sg_llc_stats to the
load balancing period of "sd_llc->parent"?

Can't this be done with update_idle_cpu_scan()? I believe it is more
frequent, "sds->total_capacity" from caller gives you the equivalent of
"group_capacity", and "group_util" is already calculated as "sum_util".

Checking "sd_llc->parent" there should be sufficient to check if there
are multiple LLC domains or not. Thoughts?


The original idea was to calculate the statistics for the CPUs within
one LLC and to set a flag on that sched group, as well as in its sg_lb_stats
(but not at sched domain scope). With this flag set on the sched group,
we can perform some comparisons in update_sd_pick_busiest() to determine whether
that sched group has any tasks that need to be moved to other LLC sched groups.
If we do this in update_idle_cpu_scan(), might it be a bit late for
update_sd_pick_busiest()?
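
To make the ordering concern concrete, here is a rough userspace sketch,
not kernel code. It assumes that the per-group statistics and the
busiest-group comparison run inside the loop over sched groups, and that
update_idle_cpu_scan() runs once after that loop. All names below
(group_stats, record_group, pick_busiest, walk_groups) and the tagging
rule are hypothetical, chosen only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for per-group load-balance stats. */
struct group_stats {
	unsigned long util;
	unsigned long capacity;
	bool llc_tagged;	/* hypothetical per-group tag */
};

/* Per-group step: tag the group from its own statistics. */
static void record_group(struct group_stats *gs)
{
	/* Hypothetical rule: tag when utilization exceeds half the capacity. */
	gs->llc_tagged = gs->util * 2 > gs->capacity;
}

/* Stand-in for the per-group comparison; it only sees tags set so far. */
static bool pick_busiest(const struct group_stats *gs)
{
	return gs->llc_tagged;
}

/*
 * Walk all groups. When record_in_loop is false, tagging is deferred
 * until after the loop, which models doing the work in a domain-wide
 * step. Returns the index of the first group picked, or -1 if none.
 */
static int walk_groups(struct group_stats *groups, size_t n,
		       bool record_in_loop)
{
	int busiest = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (record_in_loop)
			record_group(&groups[i]);
		if (busiest < 0 && pick_busiest(&groups[i]))
			busiest = (int)i;
	}

	if (!record_in_loop) {
		for (i = 0; i < n; i++)
			record_group(&groups[i]);
	}

	return busiest;
}
```

With tagging done inside the loop, a heavily utilized group can be picked
in the same pass. With tagging deferred, the tags are set only after every
comparison has already run, so nothing is picked in that pass.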

thanks,
Chenyu

+	struct sched_domain_shared *sd_share;
+
+	if (!sched_feat(SCHED_CACHE) || env->idle == CPU_NEWLY_IDLE)
+		return;
+
+	/* only care about sched domains spanning a LLC */
+	if (sd != rcu_dereference(per_cpu(sd_llc, env->dst_cpu)))
+		return;
+
+	/*
+	 * At this point we know this group spans a LLC domain.
+	 * Record the statistic of this group in its corresponding
+	 * shared LLC domain.
+	 */
+	sd_share = rcu_dereference(per_cpu(sd_llc_shared,
+					   cpumask_first(sched_group_span(group))));
+	if (!sd_share)
+		return;
+
+	if (READ_ONCE(sd_share->util_avg) != sgs->group_util)
+		WRITE_ONCE(sd_share->util_avg, sgs->group_util);
+
+	if (unlikely(READ_ONCE(sd_share->capacity) != sgs->group_capacity))
+		WRITE_ONCE(sd_share->capacity, sgs->group_capacity);
+}
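
For readers less familiar with the idiom, the two checks at the end of
record_sg_llc_stats() only store when the value differs, presumably to
avoid dirtying a shared cache line on every load-balance pass. Below is a
minimal userspace sketch of that pattern. It uses C11 relaxed atomics as a
rough approximation of READ_ONCE()/WRITE_ONCE(); struct llc_shared and
update_if_changed() are hypothetical names, not kernel APIs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared per-LLC record. */
struct llc_shared {
	_Atomic unsigned long util_avg;
	_Atomic unsigned long capacity;
};

/* Store only when the value changes; returns true if a store happened. */
static bool update_if_changed(_Atomic unsigned long *slot, unsigned long val)
{
	if (atomic_load_explicit(slot, memory_order_relaxed) == val)
		return false;

	atomic_store_explicit(slot, val, memory_order_relaxed);
	return true;
}
```

Capacity rarely changes, so in this sketch the capacity slot would almost
always take the early return, which matches the unlikely() hint in the
patch.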