Re: [PATCH 3/3] x86/sbm: Derive leaf granularity from LLC cacheinfo instead of topology domain
From: Chen, Yu C
Date: Tue May 12 2026 - 05:30:14 EST
Hi Prateek,
On 5/11/2026 3:48 PM, K Prateek Nayak wrote:
- if (!amd_fill_cpuid4_info(llc_index, &id4))
+ if (!amd_fill_cpuid4_info(llc_index, &id4)) {
c->topo.llc_id = get_cache_id(c->topo.apicid, &id4);
+ if (c == &boot_cpu_data)
+ arch_sbm_shift = get_count_order(1 + id4.eax.split.num_threads_sharing);
+ }
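For reference, here is a rough userspace sketch of the shift computation in the hunk above. It assumes that num_threads_sharing holds the number of threads sharing the cache minus one (hence the "1 +"), and that get_count_order() rounds up to the next power-of-two order. count_order() is a hypothetical stand-in, not the kernel helper itself.

```c
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's get_count_order():
 * returns the smallest order such that (1 << order) >= count,
 * or -1 when count is 0.
 */
static int count_order(unsigned int count)
{
	int order = 0;

	if (count == 0)
		return -1;

	/* Use a 64-bit shift so counts above 2^31 do not overflow. */
	while ((1ULL << order) < count)
		order++;

	return order;
}

/*
 * Assumed encoding: the CPUID field stores "threads sharing - 1",
 * so the real thread count is the field value plus one.
 */
static int llc_shift_from_field(unsigned int num_threads_sharing)
{
	return count_order(1 + num_threads_sharing);
}
```

With this sketch, a field value of 15 (16 threads per LLC) gives a shift of 4, and a non-power-of-two count such as 12 threads also rounds up to 4, leaving some unused bits in each leaf.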
So I'm slightly skeptical about systems based on AMD's heterogeneous
processors getting this right, but I have to get my hands on one to confirm.
Either way, it seems like an AMD-specific problem that I'll chase down
if it exists, but this should be fine from a testing perspective on your
system.
Right, this reminds me that Intel's heterogeneous (hybrid) processors
might also need to be accounted for if the platform has multiple LLCs.
struct sbm *sbm_alloc(void)
{
- unsigned int nr = arch_sbm_leafs;
- unsigned int nbits = 1U << arch_sbm_shift;
- unsigned int nlongs = BITS_TO_LONGS(nbits);
- struct sbm_root *root = kzalloc_flex(*root, leafs, nr);
+ unsigned int nr;
+ unsigned int nbits;
+ unsigned int nlongs;
+ struct sbm_root *root;
struct sbm_leaf *leaf;
+
+ if (!arch_sbm_shift) {
+ unsigned int max_idx = num_possible_cpus();
+
+ /*
+ * unsigned long is the base unit for bitmap in sbm_leaf.
+ * Use that for default bitmap size for compact bitmap
+ * without unused bits.
+ */
+ arch_sbm_shift = BYTES_TO_BITS(sizeof(unsigned long));
+ arch_sbm_leafs = 1 + (max_idx >> arch_sbm_shift);
+ arch_sbm_mask = (1 << arch_sbm_shift) - 1;
+ arch_sbm_bits = arch_sbm_shift;
Side note:
So while chasing sbitmap, I realized there are some users of sbitmap out
there that are essentially using only the minimal functionality that sbm
provides and can be converted over to save an extra cacheline worth of
overhead.
Thanks for pointing it out. I took a look at sbitmap; it seems to
provide a cache-friendly bit allocation strategy for different CPUs.
This is a slightly different usage model from sbm, which aims
to provide a 1:1 mapping between a CPU and its corresponding bit
in a mask in a cache-friendly manner. That said, the allocation
logic could be reused between sbitmap and sbm IMO.
Does it make sense to keep the arch_sbm_* stuff specific to the
scheduler and allow wider use of sbm for any sparse bitmap usage?
Yes, I think this is feasible. We can introduce
struct sbm *sbm_alloc(unsigned int max_idx, unsigned int leaf_bits)
to allow reuse by other non-scheduler components.
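As a rough illustration of what the sizing could look like once it is decoupled from the arch_sbm_* globals, here is a userspace sketch. struct sbm_layout and the helpers below are hypothetical names, not part of the posted series. The sketch assumes leaf_bits is a power of two and that the shift is log2(leaf_bits), e.g. 6 for a 64-bit leaf.

```c
#include <stdbool.h>

/* Hypothetical per-instance layout; not part of the posted series. */
struct sbm_layout {
	unsigned int shift;	/* log2(bits per leaf) */
	unsigned int mask;	/* bits per leaf - 1 */
	unsigned int nr_leafs;	/* leaves needed to cover [0, max_idx] */
};

/* Derive the layout from caller-supplied parameters instead of globals. */
static bool sbm_layout_init(struct sbm_layout *l, unsigned int max_idx,
			    unsigned int leaf_bits)
{
	unsigned int shift = 0;

	/* Reject zero and non-power-of-two leaf sizes. */
	if (leaf_bits == 0 || (leaf_bits & (leaf_bits - 1)))
		return false;

	while ((1U << shift) < leaf_bits)
		shift++;

	l->shift = shift;
	l->mask = leaf_bits - 1;
	/* Same "1 + (max_idx >> shift)" rounding as the quoted hunk. */
	l->nr_leafs = 1 + (max_idx >> shift);
	return true;
}

/* Which leaf holds bit idx. */
static unsigned int sbm_leaf_index(const struct sbm_layout *l, unsigned int idx)
{
	return idx >> l->shift;
}

/* Position of bit idx within its leaf. */
static unsigned int sbm_leaf_bit(const struct sbm_layout *l, unsigned int idx)
{
	return idx & l->mask;
}
```

For example, max_idx = 255 with leaf_bits = 64 would give a shift of 6, a mask of 63 and 4 leaves, and index 130 would land in leaf 2 at bit 2. The scheduler could then pass its LLC-derived value while other users pick their own leaf size.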
+ }
+
+ nr = arch_sbm_leafs;
+ nbits = 1U << arch_sbm_shift;
+ nlongs = BITS_TO_LONGS(nbits);
+ root = kzalloc_flex(*root, leafs, nr);
if (!root)
return NULL;
My QEMU has suddenly refused to boot after the conversion to cache
properties leaf changes so I'll try to see why that is the case.
Thanks, I haven't tested on a VM yet; let me take a look.
thanks,
Chenyu