[PATCH v2 0/4] Revisit NUMA imbalance tolerance and fork balancing

From: Mel Gorman
Date: Thu Nov 19 2020 - 03:30:40 EST


Changelog since v1
o Split out patch that moves imbalance calculation
o Strongly connect fork imbalance considerations with adjust_numa_imbalance

When NUMA and CPU balancing were reconciled, there was an attempt to allow
a degree of imbalance but it caused more problems than it solved. Instead,
imbalance was only allowed with an almost idle NUMA domain. A lot of the
problems have since been addressed so it's time for a revisit. There is
also an issue with how fork is balanced across threads. It's mentioned
in this context as patches 3 and 4 should share similar behaviour in terms
of a node's utilisation.

Patch 1 is just a cosmetic rename.

Patch 2 moves an imbalance calculation. It is both a micro-optimisation
and a way to avoid confusion about what imbalance means for different
group types.

Patch 3 allows a "floating" imbalance to exist so communicating tasks can
remain on the same domain until utilisation is higher. It aims
to balance compute availability with memory bandwidth.

Patch 4 is the interesting one. Currently, fork can allow a NUMA node
to be completely utilised, as long as there are idle CPUs, until
the load balancer gets involved. This caused serious problems
with a real workload. Unfortunately, I cannot share many details
about it, but there is a proxy reproducer.
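
To illustrate how patches 3 and 4 could share one notion of a node's
utilisation, here is a minimal userspace sketch. It is not code from the
series: every sketch_* name, the quarter-of-the-CPUs cutoff and the
tolerated imbalance of 2 are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed for illustration: an imbalance this small is tolerated,
 * e.g. a pair of communicating tasks left on the same node. */
#define SKETCH_IMBALANCE_MIN 2

/*
 * Assumed cutoff: imbalance is tolerated only while the destination
 * node runs fewer tasks than a quarter of its CPUs (dst_weight).
 */
static bool sketch_allow_imbalance(int dst_running, int dst_weight)
{
	assert(dst_weight > 0);
	return dst_running < dst_weight / 4;
}

/*
 * Load balancer side: report a small imbalance as zero while the
 * destination node is lightly utilised, otherwise leave it unchanged.
 */
static int sketch_adjust_imbalance(int imbalance, int dst_running,
				   int dst_weight)
{
	if (!sketch_allow_imbalance(dst_running, dst_weight))
		return imbalance;

	if (imbalance <= SKETCH_IMBALANCE_MIN)
		return 0;

	return imbalance;
}

/*
 * Fork side: keep the child on the local node only while the same
 * cutoff holds, so fork and the load balancer agree on when a node
 * counts as busy enough to spread tasks.
 */
static bool sketch_fork_stays_local(int local_running, int local_weight)
{
	return sketch_allow_imbalance(local_running, local_weight);
}
```

With a hypothetical 48-CPU node the cutoff is 12 running tasks: below
that, an imbalance of 2 is ignored and fork stays local. At or above it,
the imbalance is reported unchanged and fork looks elsewhere.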

--
2.26.2