Re: [PATCH v2 4/4] sched/rt: Split cpupri_vec->cpumask to per NUMA node to reduce contention

From: K Prateek Nayak

Date: Wed Apr 08 2026 - 11:52:59 EST


Hello Chenyu,

On 4/8/2026 5:05 PM, Chen, Yu C wrote:
> We haven't tried breaking it down further. One possible approach
> is to partition it at L2 scope, the benefit of which may depend on
> the workload.

I fear at that point we'll have too many cachelines and too much
cache pollution when the CPU starts reading this at tick to schedule
a newidle balance.

If each fragment of the mask ends up in its own cacheline, a 128-core
system would pull in 128 * 64B = 8kB of data just to traverse the mask.
At that point it becomes a trade-off between how fast you want reads vs
writes, and it is not clear writes even get faster beyond a certain
point.

Sorry, I got distracted by some other stuff today, but I'll share the
results from my experiments tomorrow.

--
Thanks and Regards,
Prateek