Re: [PATCH v2 4/4] sched/rt: Split cpupri_vec->cpumask to per NUMA node to reduce contention
From: K Prateek Nayak
Date: Fri Apr 03 2026 - 04:14:20 EST
Hello Chenyu,
On 4/3/2026 11:16 AM, Chen, Yu C wrote:
> On 4/2/2026 7:06 PM, K Prateek Nayak wrote:
>> Hello Peter,
>>
>> On 4/2/2026 4:25 PM, Peter Zijlstra wrote:
>>> On Thu, Apr 02, 2026 at 10:11:11AM +0530, K Prateek Nayak wrote:
>>>
>>>> It is still not super clear to me how the logic deals with more than
>>>> 128 CPUs in a DIE domain because that'll need more than the u64 but
>>>> sbm_find_next_bit() simply does:
>>>>
>>>> tmp = leaf->bitmap & mask; /* All are u64 */
>>>>
>>>> expecting just the u64 bitmap to represent all the CPUs in the leaf.
>>>>
>>>> If we have, say 256 CPUs per DIE, we get shift(7) and arch_sbm_mask
>>>> as 7f (127), which allows a leaf to hold more than 64 CPUs, but we are
>>>> using the "u64 bitmap" directly and not:
>>>>
>>>> find_next_bit(bitmap, arch_sbm_mask)
>>>>
>>>> Am I missing something here?
>>>
>>> Nope. That logic just isn't there, that was left as an exercise to the
>>> reader :-)
>>
>> Ack! Let me go fiddle with that.
>>
>
> Nice catch. I hadn't noticed this since we have fewer than
> 64 CPUs per die. Please feel free to send patches to me when
> they're available.
>
> And regarding your other question about the calculation of arch_sbm_shift,
> I'm trying to understand why there is a subtraction of 1, should it be:
> - arch_sbm_shift = x86_topo_system.dom_shifts[TOPO_DIE_DOMAIN] - 1;
> + arch_sbm_shift = x86_topo_system.dom_shifts[TOPO_DIE_DOMAIN - 1];
> ?
> Are we trying to filter the raw, globally unique die ID, similar to topo_apicid(),
> which masks off the lower x86_topo_system.dom_shifts[dom - 1] bits?
>
> With the above change I get the correct number of leaves (4) rather than the
> (2) from the original version.
Thanks for confirming. I guess that would just be TOPO_TILE_DOMAIN then
and would work well on AMD too since that is where the CCX is mapped.
I'll get hold of an SPR machine or use a VM to confirm the behavior with
CPUID leaf 0x1f.
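
To make the difference between the two expressions concrete, here is a
small userspace sketch (not kernel code). The enum is a trimmed-down,
hypothetical mirror of the x86 domain order, and the dom_shifts[] values
and the 128-ID count are made up for illustration; they do not describe
any machine from this thread. Only the two indexing patterns follow the
hunk quoted above.

```c
#include <assert.h>

/* Hypothetical, trimmed-down mirror of the x86 topology domain order. */
enum topo_domain {
	TOPO_SMT_DOMAIN,
	TOPO_CORE_DOMAIN,
	TOPO_MODULE_DOMAIN,
	TOPO_TILE_DOMAIN,
	TOPO_DIE_DOMAIN,
	TOPO_MAX_DOMAIN,
};

/*
 * Made-up cumulative shifts: entry [d] is the number of low ID bits
 * covered by domain d and everything below it.
 */
static const unsigned int dom_shifts[TOPO_MAX_DOMAIN] = { 1, 5, 5, 5, 7 };

/* Original form: shift of the DIE domain itself, minus one. */
unsigned int shift_minus_one(void)
{
	return dom_shifts[TOPO_DIE_DOMAIN] - 1;
}

/* Proposed form: shift of the domain just below DIE (the TILE domain). */
unsigned int shift_prev_domain(void)
{
	return dom_shifts[TOPO_DIE_DOMAIN - 1];
}

/* Number of leaves needed to cover nr_ids IDs, each leaf spanning 1 << shift IDs. */
unsigned int nr_leaves(unsigned int nr_ids, unsigned int shift)
{
	assert(shift < 32);
	return (nr_ids + (1u << shift) - 1) >> shift;
}
```

With these made-up numbers the first form yields a shift of 6 and the
second a shift of 5, so the leaf count for 128 IDs doubles. The real
values of course depend on what the topology enumeration reports.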
I'll post the patches next week since I have to check with Andrea on how
the ARM systems have decided to number their SMT threads and whether
they require separate plumbing for arch_sbm_idx_to_cpu() and
arch_sbm_cpu_to_idx().
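
For the leaf-wider-than-u64 case discussed earlier in the thread, the
rough shape I have in mind is a word-by-word search instead of a single
"leaf->bitmap & mask". The following is only a userspace sketch under
assumptions: struct wide_leaf, LEAF_BITS and wide_leaf_find_next_and()
are hypothetical names, the 128-bit leaf width is arbitrary, and
__builtin_ctzll() is the GCC/Clang builtin standing in for the kernel's
bit helpers. It is not the actual patch.

```c
#include <assert.h>
#include <stdint.h>

#define LEAF_BITS	128u			/* hypothetical leaf width */
#define LEAF_WORDS	(LEAF_BITS / 64u)

/* Hypothetical leaf whose bitmap spans more than one 64-bit word. */
struct wide_leaf {
	uint64_t bitmap[LEAF_WORDS];
};

/*
 * Return the index of the first bit at or after 'start' that is set in
 * both leaf->bitmap and mask, or LEAF_BITS if there is no such bit.
 */
unsigned int wide_leaf_find_next_and(const struct wide_leaf *leaf,
				     const uint64_t *mask,
				     unsigned int start)
{
	unsigned int first = start / 64u;
	unsigned int word;

	for (word = first; word < LEAF_WORDS; word++) {
		uint64_t tmp = leaf->bitmap[word] & mask[word];

		/* Only the first word needs the bits below 'start' dropped. */
		if (word == first)
			tmp &= ~0ULL << (start % 64u);

		if (tmp)
			return word * 64u + (unsigned int)__builtin_ctzll(tmp);
	}

	return LEAF_BITS;
}
```

A leaf that fits in one word degenerates to the existing single-u64
check, so the common case should not get more expensive in principle.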
--
Thanks and Regards,
Prateek