Re: [PATCH] sched/numa: use runnable_avg to classify node

From: Mel Gorman
Date: Thu Aug 27 2020 - 11:35:39 EST


On Tue, Aug 25, 2020 at 02:18:18PM +0200, Vincent Guittot wrote:
> Use runnable_avg to classify NUMA node state, similarly to what is done
> for the normal load balancer. This helps to ensure that the NUMA and
> normal balancers use the same view of the state of the system.
>
> - large arm64 system: 2 nodes / 224 CPUs
> hackbench -l (256000/#grp) -g #grp
>
> grp    tip/sched/core         +patchset              improvement
> 1      14.008(+/- 4.99 %)     13.800(+/- 3.88 %)      1.48 %
> 4       4.340(+/- 5.35 %)      4.283(+/- 4.85 %)      1.33 %
> 16      3.357(+/- 0.55 %)      3.359(+/- 0.54 %)     -0.06 %
> 32      3.050(+/- 0.94 %)      3.039(+/- 1.06 %)      0.38 %
> 64      2.968(+/- 1.85 %)      3.006(+/- 2.92 %)     -1.27 %
> 128     3.290(+/-12.61 %)      3.108(+/- 5.97 %)      5.51 %
> 256     3.235(+/- 3.95 %)      3.188(+/- 2.83 %)      1.45 %
>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>

Testing was a mixed bag, but the patch wins more often than it loses.
The biggest loss was a 9.04% regression on nas-SP using OpenMP for
parallelisation on Zen1. The biggest win was a gain of around 8% running
specjbb2005 on Zen2 (with some major gains of up to 55% for some thread
counts). Most workloads were stable across multiple Intel and AMD
machines.

There were some odd changes in the NUMA scanning rate, but those are
likely a side-effect, because locality over time for the same loads
did not look obviously worse. There was no negative result I could point
at that was not offset by a positive result elsewhere. Given that it is
not a universal win or loss, matching NUMA balancing and load balancing
as closely as possible is best, so

Reviewed-by: Mel Gorman <mgorman@xxxxxxx>

Thanks.

--
Mel Gorman
SUSE Labs