Re: [PATCH v3 04/21] sched/cache: Make LLC id continuous
From: Tim Chen
Date: Tue Feb 17 2026 - 16:20:24 EST
On Tue, 2026-02-17 at 15:35 +0530, Madadi Vineeth Reddy wrote:
> On 15/02/26 19:55, Chen, Yu C wrote:
> > On 2/15/2026 1:53 AM, Madadi Vineeth Reddy wrote:
> > > On 11/02/26 03:48, Tim Chen wrote:
> > > > From: Chen Yu <yu.c.chen@xxxxxxxxx>
> > > >
> > > > Introduce an index mapping between CPUs and their LLCs. This provides
> > > > a continuous per LLC index needed for cache-aware load balancing in
> > > > later patches.
> > > >
> > > > The existing per_cpu llc_id usually points to the first CPU of the
> > > > LLC domain, which is sparse and unsuitable as an array index. Using
> > > > llc_id directly would waste memory.
> > > >
> > > > With the new mapping, CPUs in the same LLC share a continuous id:
> > > >
> > > > per_cpu(llc_id, CPU=0...15) = 0
> > > > per_cpu(llc_id, CPU=16...31) = 1
> > > > per_cpu(llc_id, CPU=32...47) = 2
> > > > ...
> > > >
> > > > Once a CPU has been assigned an llc_id, the ID persists even when
> > > > the CPU is taken offline and brought back online, which simplifies
> > > > management of the ID.
> > >
> > > tl_max_llcs is never reset across multiple invocations of build_sched_domains().
> > > While this preserves LLC IDs across normal CPU hotplug events, I'm wondering about
> > > scenarios where hardware topology changes, such as physically removing/replacing
> > > CPU sockets.
> > >
> > > Example scenario:
> > > Boot with 3 LLCs: IDs {0,1,2}, tl_max_llcs=3
> > > Physical hardware change removes LLC 1
> > > New hardware added at a different position gets ID=3
> > > After multiple such events: System has 4 LLCs but IDs {0,2,5,7}, tl_max_llcs=8
> > >
> >
> > I agree that keeping tl_max_llcs non-decreasing might waste some space. The
> > original motivation for introducing a dynamic sd_llc_id was mainly that a
> > static sd_llc_id[NR_LLC] is not suitable, as we cannot find a proper upper
> > limit for NR_LLC; an arbitrary value for NR_LLC is unacceptable. That is to
> > say, tl_max_llcs serves as the historical maximum LLC index that has ever
> > been detected, much like a CPU id. It is possible that the number of
> > available LLCs shrinks due to CPUs going offline after boot-up. A value of
> > tl_max_llcs=8 indicates that this system once had 8 valid LLCs. The dense
> > mapping, on the other hand, is a side effect of dynamically allocating sd_llc_id.
> >
> > > This creates gaps in the ID space. However, I understand this trade-off might be
> > > intentional since physical topology changes are rare, and resetting tl_max_llcs and
> > > all sd_llc_id values would rebuild IDs on every invocation of build_sched_domains().
> > >
> > > I would like to know your thoughts on the overhead of resetting tl_max_llcs
> > > and sd_llc_id so that IDs are rebuilt on each invocation of
> > > build_sched_domains() and a dense mapping is always maintained.
> > >
> > >
> >
> > The current implementation is intentionally kept simple for easier review, and
> > I agree that strictly enforcing a dense mapping for sd_llc_id - by recalculating
> > the actual maximum LLC count (max_llcs) whenever the CPU topology changes - could
> > be an optimization direction once the basic version has been accepted. I assume what
> > you are suggesting is that we could reset tl_max_llcs/max_llcs/sd_llc_id for CPUs
> > in doms_new[i] within partition_sched_domains_locked() - and then rebuild these
> > values in build_sched_domains() accordingly. One risk here is a race condition when
> > modifying the llc_id of a specific CPU - but off the top of my head, valid_llc_buf()
> > should help prevent out-of-range access to sd->pf caused by such races.
> > Thoughts?
>
> Yes, resetting and rebuilding would maintain dense mapping. Given the added complexity
> of race conditions vs. minimal benefit (gaps only occur with physical topology changes),
> I think the current approach is better. We can revisit it once this version goes through.
>
The current implementation keeps the LLC id unchanged across sched domain rebuilds.
The idea was to allow pf[id] to be kept across rebuilds and to keep it pointing
to the same LLC.
That said, now that we clear pf[id] across sched domain rebuilds, this constraint
can be relaxed, and it should be okay to change the LLC id from the perspective
of cache-aware scheduling.
However, there could be some transient races with cpus_share_cache() while the
LLC id is being changed, which the current implementation avoids.
Tim