Re: [PATCH 00/13] Reconcile NUMA balancing decisions with the load balancer v3
From: Vincent Guittot
Date: Mon Feb 17 2020 - 11:15:49 EST
On Mon, 17 Feb 2020 at 16:14, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Feb 17, 2020 at 02:49:11PM +0100, Vincent Guittot wrote:
> > On Mon, 17 Feb 2020 at 11:44, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > Changelog since V2:
> > > o Rebase on top of Vincent's series again
> > > o Fix a missed rcu_read_unlock
> > > o Reduce overhead of tracepoint
> > >
> > > Changelog since V1:
> > > o Rebase on top of Vincent's series and rework
> > >
> > > Note: The baseline for this series is tip/sched/core as of February
> > > 12th rebased on top of v5.6-rc1. The series includes patches from
> > > Vincent as I needed to add a fix and build on top of it. Vincent's
> > > series on its own introduces performance regressions for *some*
> > > but not *all* machines so it's easily missed. This series overall
> > > is close to performance-neutral with some gains depending on the
> > > machine. However, the end result does less work on NUMA balancing
> > > and the fact that both the NUMA balancer and load balancer uses
> > > similar logic makes it much easier to understand.
> > >
> > > The NUMA balancer makes placement decisions on tasks that partially
> > > take the load balancer into account and vice versa but there are
> > > inconsistencies. This can result in placement decisions that override
> > > each other leading to unnecessary migrations -- both task placement
> > > and page placement. This series reconciles many of the decisions --
> > > partially Vincent's work with some fixes and optimisations on top to
> > > merge our two series.
> > >
> > > The first patch is unrelated. It's picked up by tip but was not present in
> > > the tree at the time of the fork. I'm including it here because I tested
> > > with it.
> > >
> > > The second and third patches are tracing only and was needed to get
> > > sensible data out of ftrace with respect to task placement for NUMA
> > > balancing. The NUMA balancer is *far* easier to analyse with the
> > > patches and informed how the series should be developed.
> > >
> > > Patches 4-5 are Vincent's and use very similar code patterns and logic
> > > between NUMA and load balancer. Patch 6 is a fix to Vincent's work that
> > > is necessary to avoid serious imbalances being introduced by the NUMA
> >
> > Yes the test added in load_too_imbalanced() by patch 5 doesn't seem to
> > be a good choice.
>
> But it *did* make sense intuitively!
Yes. In fact, one difference compared to your fix is that
load_too_imbalance() is also called by task_numa_compare() whereas
node_type only is only tested in task_numa_find_cpu() in your patch
>
> > I haven't remove it as it was done by your patch 6 but it might worth
> > removing it directly if a new version is needed
> >
>
> They could be folded together or part folded together but I did not see
> much value in that. I felt that keeping them seperate both preserved the
> development history and acted as a historical reference on why using a
> spare CPU can be hazardous. I do not believe it is a bisection hazard
> as performance is roughly equivalent before and after the series (so
> far at least). LKP might trip up on it and if so, we'll simply ask for
> confirmation that patch 6 fixes it.
that's fine for me
>
> --
> Mel Gorman
> SUSE Labs