Re: [Patch v4 00/16] Cache aware scheduling enhancements

From: K Prateek Nayak

Date: Tue May 19 2026 - 23:07:25 EST

Hello Tim, Chenyu,

On 5/14/2026 2:09 AM, Tim Chen wrote:
> This patch set contains cache-aware scheduling enhancements
> and bug fixes on top of Peter's sched/cache branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=sched/cache

I took the latest queue:sched/core for a spin before and after
the sched/cache merge and everything is now looking fine.

I've temporarily lost access to my usual test machine, so I could only
grab the microbenchmark data, but the results are mostly positive to
unaffected, as expected. I'll follow up if I see anything funky with
longer-running benchmarks, if and when I get a chance.

Following is the data from a dual socket Zen4c system (2 x 128C/256T)
with 32 LLCs in total:

o Kernels:

tip: queue:sched/core at commit dd29c017aed6 ("sched/rt: Have
RT_PUSH_IPI be default off for non PREEMPT_RT")

sched-cache: queue:sched/core at commit a26d9208c137 ("Merge branch
'sched/cache'")

o Benchmark results

==================================================================
Test : hackbench
Units : Normalized time in seconds
Interpretation: Lower is better
Statistic : AMean
==================================================================
Case: tip[pct imp](CV) sched_cache[pct imp](CV)
1-groups 1.00 [ -0.00]( 9.66) 0.92 [ 8.04](14.93)
2-groups 1.00 [ -0.00]( 9.22) 0.88 [ 11.96](12.53)
4-groups 1.00 [ -0.00]( 2.14) 0.99 [ 0.93]( 1.55)
8-groups 1.00 [ -0.00]( 2.80) 1.00 [ 0.22]( 3.96)
16-groups 1.00 [ -0.00]( 5.54) 1.00 [ -0.49]( 2.76)

==================================================================
Test : tbench
Units : Normalized throughput
Interpretation: Higher is better
Statistic : AMean
==================================================================
Clients: tip[pct imp](CV) sched_cache[pct imp](CV)
1 1.00 [ 0.00]( 0.03) 1.00 [ 0.30]( 0.29)
2 1.00 [ 0.00]( 0.32) 1.00 [ -0.45]( 1.86)
4 1.00 [ 0.00]( 0.34) 1.00 [ 0.38]( 0.14)
8 1.00 [ 0.00]( 0.24) 1.01 [ 0.56]( 0.34)
16 1.00 [ 0.00]( 0.45) 1.00 [ 0.12]( 0.05)
32 1.00 [ 0.00]( 0.58) 1.01 [ 1.27]( 0.58)
64 1.00 [ 0.00]( 0.81) 1.01 [ 1.32]( 0.16)
128 1.00 [ 0.00]( 0.53) 1.03 [ 3.27]( 1.15)
256 1.00 [ 0.00]( 0.30) 1.02 [ 2.14]( 0.64)
512 1.00 [ 0.00]( 3.73) 1.01 [ 1.00]( 2.73)
1024 1.00 [ 0.00]( 0.23) 0.99 [ -0.53]( 0.29)
2048 1.00 [ 0.00]( 0.14) 0.99 [ -0.73]( 0.37)

==================================================================
Test : stream-10
Units : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic : HMean
==================================================================
Test: tip[pct imp](CV) sched_cache[pct imp](CV)
Copy 1.00 [ 0.00]( 0.66) 1.00 [ 0.04]( 0.43)
Scale 1.00 [ 0.00]( 0.89) 1.00 [ 0.17]( 0.70)
Add 1.00 [ 0.00]( 0.73) 1.00 [ 0.08]( 0.73)
Triad 1.00 [ 0.00]( 0.70) 1.00 [ 0.04]( 0.75)

==================================================================
Test : stream-100
Units : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic : HMean
==================================================================
Test: tip[pct imp](CV) sched_cache[pct imp](CV)
Copy 1.00 [ 0.00]( 0.32) 1.00 [ -0.25]( 1.49)
Scale 1.00 [ 0.00]( 0.26) 0.99 [ -0.50]( 1.56)
Add 1.00 [ 0.00]( 0.29) 0.99 [ -0.69]( 1.22)
Triad 1.00 [ 0.00]( 0.27) 0.99 [ -0.71]( 1.24)

==================================================================
Test : netperf
Units : Normalized Throughput
Interpretation: Higher is better
Statistic : AMean
==================================================================
Clients: tip[pct imp](CV) sched_cache[pct imp](CV)
1-clients 1.00 [ 0.00]( 0.10) 1.00 [ -0.08]( 0.13)
2-clients 1.00 [ 0.00]( 0.29) 1.00 [ -0.01]( 0.16)
4-clients 1.00 [ 0.00]( 0.36) 1.00 [ -0.25]( 0.21)
8-clients 1.00 [ 0.00]( 0.32) 1.00 [ -0.28]( 0.16)
16-clients 1.00 [ 0.00]( 0.24) 1.00 [ -0.38]( 0.24)
32-clients 1.00 [ 0.00]( 0.42) 1.00 [ -0.46]( 0.49)
64-clients 1.00 [ 0.00]( 0.94) 1.00 [ -0.40]( 0.65)
128-clients 1.00 [ 0.00]( 1.10) 1.00 [ -0.08]( 0.89)
256-clients 1.00 [ 0.00]( 1.06) 1.00 [ -0.10]( 0.97)
512-clients 1.00 [ 0.00]( 4.68) 0.98 [ -1.56]( 4.53)
768-clients 1.00 [ 0.00](34.35) 0.98 [ -2.03](32.96)
1024-clients 1.00 [ 0.00](42.76) 0.98 [ -1.74](43.29)

==================================================================
Test : schbench
Units : Normalized 99th percentile latency in us
Interpretation: Lower is better
Statistic : Median
==================================================================
#workers: tip[pct imp](CV) sched_cache[pct imp](CV)
1 1.00 [ -0.00](18.94) 0.39 [ 61.36]( 8.81)
2 1.00 [ -0.00]( 1.67) 0.91 [ 8.57](12.48)
4 1.00 [ -0.00]( 9.79) 0.70 [ 29.73](11.76)
8 1.00 [ -0.00]( 2.27) 0.82 [ 18.18]( 6.19)
16 1.00 [ -0.00]( 0.00) 0.98 [ 1.79]( 1.82)
32 1.00 [ -0.00]( 1.92) 1.00 [ -0.00]( 0.72)
64 1.00 [ -0.00]( 1.19) 1.02 [ -1.56]( 0.77)
128 1.00 [ -0.00]( 0.67) 1.00 [ -0.00]( 0.44)
256 1.00 [ -0.00]( 0.46) 1.01 [ -0.88]( 1.08)
512 1.00 [ -0.00]( 0.33) 0.97 [ 2.64]( 2.07)
768 1.00 [ -0.00]( 4.69) 1.02 [ -1.55]( 2.51)
1024 1.00 [ -0.00]( 2.71) 1.05 [ -4.72]( 1.36)

==================================================================
Test : new-schbench-requests-per-second
Units : Normalized Requests per second
Interpretation: Higher is better
Statistic : Median
==================================================================
#workers: tip[pct imp](CV) sched_cache[pct imp](CV)
1 1.00 [ 0.00]( 0.15) 0.99 [ -0.59]( 0.15)
2 1.00 [ 0.00]( 0.00) 0.99 [ -0.59]( 0.15)
4 1.00 [ 0.00]( 0.00) 1.00 [ -0.29]( 0.15)
8 1.00 [ 0.00]( 0.15) 1.00 [ 0.00]( 0.00)
16 1.00 [ 0.00]( 0.15) 1.00 [ 0.00]( 0.00)
32 1.00 [ 0.00]( 0.15) 1.00 [ -0.29]( 0.00)
64 1.00 [ 0.00]( 0.00) 1.00 [ 0.00]( 0.00)
128 1.00 [ 0.00](12.53) 0.99 [ -0.59](13.81)
256 1.00 [ 0.00]( 0.15) 1.00 [ -0.28]( 0.51)
512 1.00 [ 0.00]( 0.84) 1.01 [ 0.75]( 1.02)
768 1.00 [ 0.00]( 2.05) 1.01 [ 1.18]( 1.25)
1024 1.00 [ 0.00]( 2.90) 0.98 [ -1.62]( 1.25)

==================================================================
Test : new-schbench-wakeup-latency
Units : Normalized 99th percentile latency in us
Interpretation: Lower is better
Statistic : Median
==================================================================
#workers: tip[pct imp](CV) sched_cache[pct imp](CV)
1 1.00 [ -0.00](12.99) 1.33 [-33.33](31.03)
2 1.00 [ -0.00]( 4.08) 0.77 [ 23.08]( 5.34)
4 1.00 [ -0.00]( 0.00) 0.82 [ 18.18]( 5.53)
8 1.00 [ -0.00]( 0.00) 0.91 [ 9.09]( 0.00)
16 1.00 [ -0.00]( 4.56) 1.00 [ -0.00]( 4.84)
32 1.00 [ -0.00]( 0.00) 0.91 [ 9.09]( 0.00)
64 1.00 [ -0.00]( 5.00) 1.00 [ -0.00]( 5.00)
128 1.00 [ -0.00]( 7.45) 1.17 [-16.67](18.75)
256 1.00 [ -0.00]( 2.70) 1.02 [ -2.49]( 5.07)
512 1.00 [ -0.00]( 0.00) 1.00 [ -0.00]( 0.00)
768 1.00 [ -0.00]( 1.66) 1.02 [ -2.44]( 1.30)
1024 1.00 [ -0.00]( 3.32) 1.01 [ -1.19]( 1.92)

==================================================================
Test : new-schbench-request-latency
Units : Normalized 99th percentile latency in us
Interpretation: Lower is better
Statistic : Median
==================================================================
#workers: tip[pct imp](CV) sched_cache[pct imp](CV)
1 1.00 [ -0.00]( 0.14) 1.01 [ -0.80]( 0.41)
2 1.00 [ -0.00]( 0.14) 1.02 [ -1.60]( 0.27)
4 1.00 [ -0.00]( 0.00) 1.01 [ -1.07]( 0.68)
8 1.00 [ -0.00]( 0.14) 1.01 [ -0.80]( 0.00)
16 1.00 [ -0.00]( 1.49) 0.98 [ 1.82]( 0.00)
32 1.00 [ -0.00]( 0.89) 0.99 [ 0.53]( 0.27)
64 1.00 [ -0.00]( 1.43) 1.00 [ -0.26]( 1.22)
128 1.00 [ -0.00]( 2.78) 1.01 [ -0.89]( 3.06)
256 1.00 [ -0.00]( 0.13) 1.00 [ -0.00]( 0.13)
512 1.00 [ -0.00]( 6.72) 1.07 [ -6.59]( 8.20)
768 1.00 [ -0.00]( 3.42) 1.05 [ -4.61]( 2.67)
1024 1.00 [ -0.00]( 4.37) 0.99 [ 1.43]( 2.40)
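
For reading the tables above: the mail does not spell out how the columns
are computed, so the following is only a minimal sketch under assumptions.
It assumes each value is normalized against the tip baseline, that
"pct imp" is the relative change with the sign flipped for lower-is-better
metrics (so positive always means sched_cache did better), and that CV is
the standard deviation as a percentage of the mean. The function names and
the sample numbers are hypothetical, and whether the population or sample
standard deviation was used is not known. The reported figures are
consistent with this reading, e.g. hackbench 1-groups at 0.92 normalized
time alongside roughly 8% improvement.

```python
from statistics import mean, pstdev


def normalize(baseline, value):
    """Express a result relative to the baseline (baseline becomes 1.00)."""
    return value / baseline


def pct_imp(baseline, value, lower_is_better):
    """Percentage improvement over the baseline; positive means better.

    Assumed formula: the sign is flipped for lower-is-better metrics
    (time, latency) so that a reduction shows up as a positive number.
    """
    delta = (baseline - value) if lower_is_better else (value - baseline)
    return delta / baseline * 100.0


def cv_percent(samples):
    """Coefficient of variation: stddev as a percentage of the mean.

    Assumption: population stddev; the original runs may use sample stddev.
    """
    return pstdev(samples) / mean(samples) * 100.0


if __name__ == "__main__":
    # Hypothetical raw numbers, not taken from the runs in this mail.
    tip_time, sched_cache_time = 100.0, 92.0
    print(f"{normalize(tip_time, sched_cache_time):.2f}")
    print(f"{pct_imp(tip_time, sched_cache_time, lower_is_better=True):.2f}")
    print(f"{cv_percent([98.0, 100.0, 102.0]):.2f}")
```

With this reading, a row such as "0.98 [ -1.56]( 4.53)" sits well inside
its own run-to-run variation, while a row such as "0.39 [ 61.36]( 8.81)"
does not.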
---

Thanks a ton! And sorry for not having been the most responsive on the
latest iterations.

--
Thanks and Regards,
Prateek