Re: [PATCH v3 04/21] sched/cache: Make LLC id continuous

From: Madadi Vineeth Reddy

Date: Sat Feb 14 2026 - 12:55:07 EST


On 11/02/26 03:48, Tim Chen wrote:
> From: Chen Yu <yu.c.chen@xxxxxxxxx>
>
> Introduce an index mapping between CPUs and their LLCs. This provides
> a continuous per LLC index needed for cache-aware load balancing in
> later patches.
>
> The existing per_cpu llc_id usually points to the first CPU of the
> LLC domain, which is sparse and unsuitable as an array index. Using
> llc_id directly would waste memory.
>
> With the new mapping, CPUs in the same LLC share a continuous id:
>
> per_cpu(llc_id, CPU=0...15) = 0
> per_cpu(llc_id, CPU=16...31) = 1
> per_cpu(llc_id, CPU=32...47) = 2
> ...
>
> Once a CPU has been assigned an llc_id, this ID persists even when
> the CPU is taken offline and brought back online, which can facilitate
> the management of the ID.

tl_max_llcs is never reset across multiple invocations of build_sched_domains().
While this preserves LLC IDs across normal CPU hotplug events, I'm wondering about
scenarios where the hardware topology itself changes, such as physically
removing/replacing CPU sockets.

Example scenario:
- Boot with 3 LLCs: IDs {0,1,2}, tl_max_llcs=3
- A physical hardware change removes LLC 1
- New hardware added at a different position gets ID=3
- After multiple such events: the system has 4 LLCs but IDs {0,2,5,7}, tl_max_llcs=8

This creates gaps in the ID space. However, I understand this trade-off might be
intentional since physical topology changes are rare, and resetting tl_max_llcs and
all sd_llc_id values would rebuild IDs on every invocation of build_sched_domains().

I'd like to know your thoughts on the overhead of resetting tl_max_llcs and
sd_llc_id so that IDs are rebuilt on each invocation of build_sched_domains(),
which would always maintain a dense mapping.

Thanks,
Vineeth

>
> Co-developed-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
> Co-developed-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> Signed-off-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> Signed-off-by: Chen Yu <yu.c.chen@xxxxxxxxx>
> ---
>
> Notes:
> v2->v3:
> Allocate the LLC id according to the topology level data directly, rather
> than calculating from the sched domain. This simplifies the code.
> (Peter Zijlstra, K Prateek Nayak)
>
> kernel/sched/topology.c | 47 ++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index cf643a5ddedd..ca46b5cf7f78 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -20,6 +20,7 @@ void sched_domains_mutex_unlock(void)
> /* Protected by sched_domains_mutex: */
> static cpumask_var_t sched_domains_tmpmask;
> static cpumask_var_t sched_domains_tmpmask2;
> +static int tl_max_llcs;
>
> static int __init sched_debug_setup(char *str)
> {
> @@ -658,7 +659,7 @@ static void destroy_sched_domains(struct sched_domain *sd)
> */
> DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
> DEFINE_PER_CPU(int, sd_llc_size);
> -DEFINE_PER_CPU(int, sd_llc_id);
> +DEFINE_PER_CPU(int, sd_llc_id) = -1;
> DEFINE_PER_CPU(int, sd_share_id);
> DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
> DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
> @@ -684,7 +685,6 @@ static void update_top_cache_domain(int cpu)
>
> rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
> per_cpu(sd_llc_size, cpu) = size;
> - per_cpu(sd_llc_id, cpu) = id;
> rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
>
> sd = lowest_flag_domain(cpu, SD_CLUSTER);
> @@ -2567,10 +2567,18 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>
> /* Set up domains for CPUs specified by the cpu_map: */
> for_each_cpu(i, cpu_map) {
> - struct sched_domain_topology_level *tl;
> + struct sched_domain_topology_level *tl, *tl_llc = NULL;
> + int lid;
>
> sd = NULL;
> for_each_sd_topology(tl) {
> + int flags = 0;
> +
> + if (tl->sd_flags)
> + flags = (*tl->sd_flags)();
> +
> + if (flags & SD_SHARE_LLC)
> + tl_llc = tl;
>
> sd = build_sched_domain(tl, cpu_map, attr, sd, i);
>
> @@ -2581,6 +2589,39 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
> if (cpumask_equal(cpu_map, sched_domain_span(sd)))
> break;
> }
> +
> + lid = per_cpu(sd_llc_id, i);
> + if (lid == -1) {
> + int j;
> +
> + /*
> + * Assign the llc_id to the CPUs that do not
> + * have an LLC.
> + */
> + if (!tl_llc) {
> + per_cpu(sd_llc_id, i) = tl_max_llcs++;
> +
> + continue;
> + }
> +
> + /* try to reuse the llc_id of its siblings */
> + for_each_cpu(j, tl_llc->mask(tl_llc, i)) {
> + if (i == j)
> + continue;
> +
> + lid = per_cpu(sd_llc_id, j);
> +
> + if (lid != -1) {
> + per_cpu(sd_llc_id, i) = lid;
> +
> + break;
> + }
> + }
> +
> + /* a new LLC is detected */
> + if (lid == -1)
> + per_cpu(sd_llc_id, i) = tl_max_llcs++;
> + }
> }
>
> if (WARN_ON(!topology_span_sane(cpu_map)))