Re: [PATCH 1/2] sched/fair: Couple wakee flips with heavy wakers
From: Mike Galbraith
Date: Tue Oct 26 2021 - 08:14:29 EST
On Tue, 2021-10-26 at 12:57 +0100, Mel Gorman wrote:
>
> The patch in question was also tested on other workloads on NUMA
> machines. For a 2-socket machine (20 cores, HT enabled so 40 CPUs)
> running specjbb 2005 with one JVM per NUMA node, the patch also scaled
> reasonably well
That's way more interesting. No idea what this thing does under the
hood, thus whether it should be helped or not, but at least it's a
real deal benchmark vs a kernel hacker tool.
> specjbb
>                         5.15.0-rc3            5.15.0-rc3
>                            vanilla sched-wakeeflips-v1r1
> Hmean tput-1    50044.48 (  0.00%)    53969.00 *  7.84%*
> Hmean tput-2   106050.31 (  0.00%)   113580.78 *  7.10%*
> Hmean tput-3   156701.44 (  0.00%)   164857.00 *  5.20%*
> Hmean tput-4   196538.75 (  0.00%)   218373.42 * 11.11%*
> Hmean tput-5   247566.16 (  0.00%)   267173.09 *  7.92%*
> Hmean tput-6   284981.46 (  0.00%)   311007.14 *  9.13%*
> Hmean tput-7   328882.48 (  0.00%)   359373.89 *  9.27%*
> Hmean tput-8   366941.24 (  0.00%)   393244.37 *  7.17%*
> Hmean tput-9   402386.74 (  0.00%)   433010.43 *  7.61%*
> Hmean tput-10  437551.05 (  0.00%)   475756.08 *  8.73%*
> Hmean tput-11  481349.41 (  0.00%)   519824.54 *  7.99%*
> Hmean tput-12  533148.45 (  0.00%)   565070.21 *  5.99%*
> Hmean tput-13  570563.97 (  0.00%)   609499.06 *  6.82%*
> Hmean tput-14  601117.97 (  0.00%)   647876.05 *  7.78%*
> Hmean tput-15  639096.38 (  0.00%)   690854.46 *  8.10%*
> Hmean tput-16  682644.91 (  0.00%)   722826.06 *  5.89%*
> Hmean tput-17  732248.96 (  0.00%)   758805.17 *  3.63%*
> Hmean tput-18  762771.33 (  0.00%)   791211.66 *  3.73%*
> Hmean tput-19  780582.92 (  0.00%)   819064.19 *  4.93%*
> Hmean tput-20  812183.95 (  0.00%)   836664.87 *  3.01%*
> Hmean tput-21  821415.48 (  0.00%)   833734.23 (  1.50%)
> Hmean tput-22  815457.65 (  0.00%)   844393.98 *  3.55%*
> Hmean tput-23  819263.63 (  0.00%)   846109.07 *  3.28%*
> Hmean tput-24  817962.95 (  0.00%)   839682.92 *  2.66%*
> Hmean tput-25  807814.64 (  0.00%)   841826.52 *  4.21%*
> Hmean tput-26  811755.89 (  0.00%)   838543.08 *  3.30%*
> Hmean tput-27  799341.75 (  0.00%)   833487.26 *  4.27%*
> Hmean tput-28  803434.89 (  0.00%)   829022.50 *  3.18%*
> Hmean tput-29  803233.25 (  0.00%)   826622.37 *  2.91%*
> Hmean tput-30  800465.12 (  0.00%)   824347.42 *  2.98%*
> Hmean tput-31  791284.39 (  0.00%)   791575.67 (  0.04%)
> Hmean tput-32  781930.07 (  0.00%)   805725.80 (  3.04%)
> Hmean tput-33  785194.31 (  0.00%)   804795.44 (  2.50%)
> Hmean tput-34  781325.67 (  0.00%)   800067.53 (  2.40%)
> Hmean tput-35  777715.92 (  0.00%)   753926.32 ( -3.06%)
> Hmean tput-36  770516.85 (  0.00%)   783328.32 (  1.66%)
> Hmean tput-37  758067.26 (  0.00%)   772243.18 *  1.87%*
> Hmean tput-38  764815.45 (  0.00%)   769156.32 (  0.57%)
> Hmean tput-39  757885.41 (  0.00%)   757670.59 ( -0.03%)
> Hmean tput-40  750140.15 (  0.00%)   760739.13 (  1.41%)
>
> The largest regression was within noise. Most results were outside the
> noise.
>
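
The percentage column in the table is consistent with a plain relative
change against the vanilla kernel. Below is a minimal Python sketch of
that arithmetic, assuming that is how the report derives it. The names
`pct_change`, `largest_regression` and `ROWS` are hypothetical, the rows
are a hand-copied subset of the table, and reading `*...*` as "outside
the noise" is an assumption taken from the summary sentence above, not
from the reporting tool itself.

```python
# Subset of rows hand-copied from the quoted specjbb table:
# (label, vanilla throughput, patched throughput, starred in the report).
# The "starred" flag is assumed to mean "outside the noise".
ROWS = [
    ("tput-1", 50044.48, 53969.00, True),
    ("tput-4", 196538.75, 218373.42, True),
    ("tput-35", 777715.92, 753926.32, False),
    ("tput-39", 757885.41, 757670.59, False),
]


def pct_change(baseline, patched):
    """Relative change of patched vs. baseline, in percent."""
    return (patched - baseline) / baseline * 100.0


def largest_regression(rows):
    """Return the row with the most negative relative change."""
    return min(rows, key=lambda r: pct_change(r[1], r[2]))


if __name__ == "__main__":
    for label, base, new, starred in ROWS:
        print(f"{label:8s} {pct_change(base, new):+6.2f}%  starred={starred}")
    print("largest regression:", largest_regression(ROWS)[0])
```

In this subset the largest regression is the tput-35 row, which the
report does not star, matching the statement that the largest
regression was within noise.
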
> Some HPC workloads showed little difference but they do not communicate
> that heavily. redis microbenchmark showed mostly neutral results.
> schbench (facebook simulator workload that is latency sensitive) showed a
> mix of results, but helped more than it hurt. Even the machine with the
> worst results for schbench showed improved wakeup latencies at the 99th
> percentile. These were all on NUMA machines.
>