Re: [GIT PULL] Automatic NUMA Balancing V11
From: Stephen Rothwell
Date: Wed Dec 12 2012 - 16:27:34 EST
Hi,
On Wed, 12 Dec 2012 10:03:38 +0000 Mel Gorman <mgorman@xxxxxxx> wrote:
>
> This is a pull request for "Automatic NUMA Balancing V11". The list
> of changes since commit f4a75d2eb7b1e2206094b901be09adb31ba63681:
>
> Linux 3.7-rc6 (2012-11-16 17:42:40 -0800)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma.git balancenuma-v11
>
> for you to fetch changes up to 4fc3f1d66b1ef0d7b8dc11f4ff1cc510f78b37d6:
>
> mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable (2012-12-11 14:43:00 +0000)
>
> There are three implementations for NUMA balancing: this tree (balancenuma),
> numacore which has been developed in tip/master and autonuma which is in
> aa.git. In almost all respects balancenuma is the dumbest of the three
> because its main impact is on the VM side with no attempt to be smart
> about scheduling. In the interest of getting the ball rolling, it would
> be desirable to see this much merged for 3.8 with the view to building
> scheduler smarts on top and adapting the VM where required for 3.9.
>
> The most recent set of comparisons available from different people are
>
> mel: https://lkml.org/lkml/2012/12/9/108
> mingo: https://lkml.org/lkml/2012/12/7/331
> tglx: https://lkml.org/lkml/2012/12/10/437
> srikar: https://lkml.org/lkml/2012/12/10/397
>
> The results are a mixed bag. In my own tests, balancenuma does reasonably
> well. It's dumb as rocks and does not regress against mainline. On the
> other hand, Ingo's tests show that balancenuma is incapable of converging
> for the workloads driven by perf which is bad but is potentially explained
> by the lack of scheduler smarts. Thomas' results show balancenuma improves
> on mainline but falls far short of numacore or autonuma. Srikar's results
> indicate we all suffer on a large machine with imbalanced node sizes.
>
> My own testing showed that recent numacore results have improved
> dramatically, particularly in the last week but not universally. We've
> butted heads heavily on system CPU usage and high levels of migration even
> when it shows that overall performance is better. There are also cases
> where it regresses. Of interest is that for specjbb in some configurations
> it will regress for lower numbers of warehouses and show gains for higher
> numbers which is not reported by the tool by default and sometimes missed
> in reports. Recently I reported for numacore that the JVM was crashing
> with NullPointerExceptions but currently it's unclear what the source of
> this problem is. Initially I thought it was in how numacore batch handles
> PTEs but I no longer think this is the case. It's possible numacore is
> just able to trigger it due to higher rates of migration.
>
> These reports were quite late in the cycle so I/we would like to start
> with this tree as it contains much of the code we can agree on and has
> not changed significantly over the last 2-3 weeks.
It has, however, all been rebased from what still exists in the linux-next
tree (as part of the tip tree).
--
Cheers,
Stephen Rothwell sfr@xxxxxxxxxxxxxxxx