Re: [PATCH 1/2] sched/fair: Couple wakee flips with heavy wakers

From: Mike Galbraith
Date: Tue Oct 26 2021 - 08:14:29 EST


On Tue, 2021-10-26 at 12:57 +0100, Mel Gorman wrote:
>
> The patch in question was also tested on other workloads on NUMA
> machines. For a 2-socket machine (20 cores, HT enabled so 40 CPUs)
> running specjbb 2005 with one JVM per NUMA node, the patch also scaled
> reasonably well

That's way more interesting. No idea what this thing does under
the hood, and thus whether it should be helped or not, but at least
it's a real deal benchmark vs a kernel hacker tool.

> specjbb
>                               5.15.0-rc3             5.15.0-rc3
>                                  vanilla  sched-wakeeflips-v1r1
> Hmean     tput-1     50044.48 (   0.00%)    53969.00 *   7.84%*
> Hmean     tput-2    106050.31 (   0.00%)   113580.78 *   7.10%*
> Hmean     tput-3    156701.44 (   0.00%)   164857.00 *   5.20%*
> Hmean     tput-4    196538.75 (   0.00%)   218373.42 *  11.11%*
> Hmean     tput-5    247566.16 (   0.00%)   267173.09 *   7.92%*
> Hmean     tput-6    284981.46 (   0.00%)   311007.14 *   9.13%*
> Hmean     tput-7    328882.48 (   0.00%)   359373.89 *   9.27%*
> Hmean     tput-8    366941.24 (   0.00%)   393244.37 *   7.17%*
> Hmean     tput-9    402386.74 (   0.00%)   433010.43 *   7.61%*
> Hmean     tput-10   437551.05 (   0.00%)   475756.08 *   8.73%*
> Hmean     tput-11   481349.41 (   0.00%)   519824.54 *   7.99%*
> Hmean     tput-12   533148.45 (   0.00%)   565070.21 *   5.99%*
> Hmean     tput-13   570563.97 (   0.00%)   609499.06 *   6.82%*
> Hmean     tput-14   601117.97 (   0.00%)   647876.05 *   7.78%*
> Hmean     tput-15   639096.38 (   0.00%)   690854.46 *   8.10%*
> Hmean     tput-16   682644.91 (   0.00%)   722826.06 *   5.89%*
> Hmean     tput-17   732248.96 (   0.00%)   758805.17 *   3.63%*
> Hmean     tput-18   762771.33 (   0.00%)   791211.66 *   3.73%*
> Hmean     tput-19   780582.92 (   0.00%)   819064.19 *   4.93%*
> Hmean     tput-20   812183.95 (   0.00%)   836664.87 *   3.01%*
> Hmean     tput-21   821415.48 (   0.00%)   833734.23 (   1.50%)
> Hmean     tput-22   815457.65 (   0.00%)   844393.98 *   3.55%*
> Hmean     tput-23   819263.63 (   0.00%)   846109.07 *   3.28%*
> Hmean     tput-24   817962.95 (   0.00%)   839682.92 *   2.66%*
> Hmean     tput-25   807814.64 (   0.00%)   841826.52 *   4.21%*
> Hmean     tput-26   811755.89 (   0.00%)   838543.08 *   3.30%*
> Hmean     tput-27   799341.75 (   0.00%)   833487.26 *   4.27%*
> Hmean     tput-28   803434.89 (   0.00%)   829022.50 *   3.18%*
> Hmean     tput-29   803233.25 (   0.00%)   826622.37 *   2.91%*
> Hmean     tput-30   800465.12 (   0.00%)   824347.42 *   2.98%*
> Hmean     tput-31   791284.39 (   0.00%)   791575.67 (   0.04%)
> Hmean     tput-32   781930.07 (   0.00%)   805725.80 (   3.04%)
> Hmean     tput-33   785194.31 (   0.00%)   804795.44 (   2.50%)
> Hmean     tput-34   781325.67 (   0.00%)   800067.53 (   2.40%)
> Hmean     tput-35   777715.92 (   0.00%)   753926.32 (  -3.06%)
> Hmean     tput-36   770516.85 (   0.00%)   783328.32 (   1.66%)
> Hmean     tput-37   758067.26 (   0.00%)   772243.18 *   1.87%*
> Hmean     tput-38   764815.45 (   0.00%)   769156.32 (   0.57%)
> Hmean     tput-39   757885.41 (   0.00%)   757670.59 (  -0.03%)
> Hmean     tput-40   750140.15 (   0.00%)   760739.13 (   1.41%)
>
> The largest regression was within noise. Most results were outside the
> noise.
>
> Some HPC workloads showed little difference but they do not communicate
> that heavily. redis microbenchmark showed mostly neutral results.
> schbench (facebook simulator workload that is latency sensitive) showed a
> mix of results, but helped more than it hurt. Even the machine with the
> worst results for schbench showed improved wakeup latencies at the 99th
> percentile. These were all on NUMA machines.
>
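
For anyone reading the quoted specjbb table: the percentage column appears
to be the relative change of the patched kernel's Hmean against vanilla.
A minimal sketch under that assumption (the helper name relative_gain is
made up for illustration and is not part of any benchmark tooling), using
three rows copied from the table:

```python
def relative_gain(baseline: float, patched: float) -> float:
    """Percent change of patched throughput relative to baseline."""
    return (patched - baseline) / baseline * 100.0


# (vanilla, patched) Hmean pairs copied from the quoted table.
rows = {
    "tput-1": (50044.48, 53969.00),
    "tput-4": (196538.75, 218373.42),
    "tput-35": (777715.92, 753926.32),
}

for name, (vanilla, patched) in rows.items():
    # Assumed reading: the table's percentage is this relative change.
    print(f"{name:8s} {relative_gain(vanilla, patched):6.2f}%")
```

Rounded to two decimals, these three rows reproduce the table's 7.84%,
11.11% and -3.06%, which is consistent with that reading of the column.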