Re: [PATCH] sched/numa: use runnable_avg to classify node
From: Vincent Guittot
Date: Fri Aug 28 2020 - 02:48:28 EST
On Thu, 27 Aug 2020 at 20:22, Mel Gorman <mgorman@xxxxxxx> wrote:
>
> On Thu, Aug 27, 2020 at 05:43:11PM +0200, Vincent Guittot wrote:
> > > The testing was a mixed bag of wins and losses but wins more than it
> > > loses. Biggest loss was a 9.04% regression on nas-SP using openmp for
> > > parallelisation on Zen1. Biggest win was around 8% gain running
> > > specjbb2005 on Zen2 (with some major gains of up to 55% for some thread
> > > counts). Most workloads were stable across multiple Intel and AMD
> > > machines.
> > >
> > > There were some oddities in changes in NUMA scanning rate but that is
> > > likely a side-effect because the locality over time for the same loads
> > > did not look obviously worse. There was no negative result I could point
> > > at that was not offset by a positive result elsewhere. Given it's not
> > > a universal win or loss, matching numa and lb balancing as closely as
> > > possible is best so
> > >
> > > Reviewed-by: Mel Gorman <mgorman@xxxxxxx>
> >
> > Thanks.
> >
> > I will try to reproduce the nas-SP test on my setup to see what is going on.
> >
>
> You can try but you might be chasing ghosts. Please note that this nas-SP
> observation was only on zen1 and only for C-class and OMP. The other
> machines tested for the same class and OMP were fine (including zen2). Even
> D-class on the same machine with OMP was fine as was MPI in both cases. The
> bad result indicated that NUMA scanning and faulting were higher but that
> is more likely to be a problem with NUMA balancing than your patch.
>
> In the five iterations, two iterations showed a large spike in scan rate
> towards the end of an iteration but not the other three. The scan rate
> was also not consistently high so there is a degree of luck involved with
> SP specifically and there is not a consistent penalty as a result of
> your patch.
>
> The only thing to be aware of is that this patch might show up in
> bisections once it's merged for both performance gains and losses.
Thanks for the detailed explanation. In that case I will save my time
and continue working on the fairness problem.
Vincent
>
> --
> Mel Gorman
> SUSE Labs