Re: [PATCH v2 4/4] sched/rt: Split cpupri_vec->cpumask to per NUMA node to reduce contention
From: K Prateek Nayak
Date: Wed Apr 08 2026 - 11:52:59 EST
Hello Chenyu,
On 4/8/2026 5:05 PM, Chen, Yu C wrote:
> We haven't tried breaking it down further. One possible approach
> is to partition it at L2 scope, the benefit of which may depend on
> the workload.
I fear that at that point we'll have too many cachelines and too much
cache pollution when the CPU starts reading these at each tick to
schedule a newidle balance.
A 128-core system would bring in 128 * 64B = 8kB worth of data just to
traverse the mask. At that point it becomes a trade-off between how
fast you want reads vs. writes, and does partitioning even keep
speeding up writes past a certain point?
Sorry, I got distracted by some other stuff today, but I'll share the
results from my experiments tomorrow.
--
Thanks and Regards,
Prateek