Re: [PATCH] sched/numa: use runnable_avg to classify node

From: Mel Gorman
Date: Tue Aug 25 2020 - 09:58:47 EST


On Tue, Aug 25, 2020 at 02:18:18PM +0200, Vincent Guittot wrote:
> Use runnable_avg to classify numa node state similarly to what is done for
> normal load balancer. This helps to ensure that numa and normal balancers
> use the same view of the state of the system.
>
> - large arm64 system: 2 nodes / 224 CPUs
> hackbench -l (256000/#grp) -g #grp
>
> grp    tip/sched/core        +patchset             improvement
> 1      14.008(+/- 4.99 %)    13.800(+/- 3.88 %)     1.48 %
> 4       4.340(+/- 5.35 %)     4.283(+/- 4.85 %)     1.33 %
> 16      3.357(+/- 0.55 %)     3.359(+/- 0.54 %)    -0.06 %
> 32      3.050(+/- 0.94 %)     3.039(+/- 1.06 %)     0.38 %
> 64      2.968(+/- 1.85 %)     3.006(+/- 2.92 %)    -1.27 %
> 128     3.290(+/-12.61 %)     3.108(+/- 5.97 %)     5.51 %
> 256     3.235(+/- 3.95 %)     3.188(+/- 2.83 %)     1.45 %
>

Intuitively the patch makes sense but I'm not a fan of using hackbench
for evaluating NUMA balancing. The tasks are too short-lived and it's
not sensitive enough to data placement because of the small footprint
and because hackbench tends to saturate a machine.
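
For anyone following along, the kind of classification being discussed can
be sketched roughly as below. This is a standalone userspace illustration,
not the patch itself: the struct, the field names, classify_node() and the
way imbalance_pct is applied are all assumptions made for the example. The
point it shows is that a node can be flagged as overloaded on runnable_avg
even when utilisation alone stays under the threshold.

```c
#include <stdbool.h>

/* Hypothetical node states, named for this sketch only. */
enum node_state { NODE_HAS_SPARE, NODE_FULLY_BUSY, NODE_OVERLOADED };

/* Hypothetical per-node summary; field names are assumptions. */
struct node_stats {
	unsigned int nr_running;   /* runnable tasks on the node */
	unsigned int nr_cpus;      /* CPUs in the node */
	unsigned long capacity;    /* summed compute capacity */
	unsigned long util;        /* summed utilisation average */
	unsigned long runnable;    /* summed runnable average */
};

/*
 * Classify a node using both utilisation and runnable averages.
 * imbalance_pct is a percentage margin above 100 (e.g. 125).
 */
enum node_state classify_node(unsigned int imbalance_pct,
			      const struct node_stats *ns)
{
	/* Utilisation, scaled by the margin, exceeds capacity. */
	bool util_high = ns->capacity * 100 < ns->util * imbalance_pct;
	/* Runnable time exceeds capacity scaled by the margin. */
	bool runnable_high = ns->capacity * imbalance_pct < ns->runnable * 100;
	bool util_low = ns->capacity * 100 > ns->util * imbalance_pct;
	bool runnable_low = ns->capacity * imbalance_pct > ns->runnable * 100;

	/* More tasks than CPUs and either signal over its threshold. */
	if (ns->nr_running > ns->nr_cpus && (util_high || runnable_high))
		return NODE_OVERLOADED;

	/* Idle CPUs available, or both signals under their thresholds. */
	if (ns->nr_running < ns->nr_cpus || (util_low && runnable_low))
		return NODE_HAS_SPARE;

	return NODE_FULLY_BUSY;
}
```

With imbalance_pct at 125 and a capacity of 4096, a node running 8 tasks
on 4 CPUs with util 2000 and runnable 6000 comes out as overloaded in this
sketch purely because of the runnable check.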

As predicting NUMA balancing behaviour in your head can be difficult, I've
queued up a battery of tests on a few different NUMA machines and will see
what falls out. It'll take a few days as some of the tests are long-lived.

Baseline will be 5.9-rc2 as I haven't looked at the topology rework in
tip/sched/core and this patch should not be related to it.

--
Mel Gorman
SUSE Labs