Re: numa/core regressions fixed - more testers wanted

From: Rik van Riel
Date: Tue Nov 20 2012 - 22:23:20 EST

Next message: viresh kumar: "Re: [PATCH 17/42] sh-pfc: Move driver from drivers/sh/ to drivers/pinctrl/"
Previous message: Xiao Guangrong: "Re: [PATCH 2/5] KVM: MMU: simplify mmu_set_spte"
In reply to: Andrew Theurer: "Re: numa/core regressions fixed - more testers wanted"
Next in thread: Hugh Dickins: "Re: numa/core regressions fixed - more testers wanted"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 11/20/2012 08:54 PM, Andrew Theurer wrote:

> I can confirm single JVM JBB is working well for me. I see a 30%
> improvement over autoNUMA. What I can't make sense of is some perf
> stats (taken at 80 warehouses on 4 x WST-EX, 512GB memory):

AutoNUMA does not have native THP migration, which may explain some
of the difference.

> tips numa/core:
>
> 5,429,632,865 node-loads
> 3,806,419,082 node-load-misses(70.1%)
> 2,486,756,884 node-stores
> 2,042,557,277 node-store-misses(82.1%)
> 2,878,655,372 node-prefetches
> 2,201,441,900 node-prefetch-misses
>
> autoNUMA:
>
> 4,538,975,144 node-loads
> 2,666,374,830 node-load-misses(58.7%)
> 2,148,950,354 node-stores
> 1,682,942,931 node-store-misses(78.3%)
> 2,191,139,475 node-prefetches
> 1,633,752,109 node-prefetch-misses
>
> The percentage of misses is higher for numa/core. I would have expected
> the performance increase to be due to lower "node-misses", but perhaps I am
> misinterpreting the perf data.

Lack of native THP migration may be enough to explain the
performance difference, despite autonuma having better node
locality.
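
For anyone who wants to re-check the miss percentages, here is a
minimal Python sketch that recomputes them from the raw counters
quoted above. The dictionary layout and the helper names (COUNTERS,
PAIRS, miss_pct, summarize) are made up for illustration; only the
numbers come from the perf output in this thread. It also derives
the prefetch miss ratios, which were not given as percentages.

```python
# Raw perf counters copied from the figures quoted in this thread.
# The structure and names below are illustrative assumptions.
COUNTERS = {
    "numa/core": {
        "node-loads": 5_429_632_865,
        "node-load-misses": 3_806_419_082,
        "node-stores": 2_486_756_884,
        "node-store-misses": 2_042_557_277,
        "node-prefetches": 2_878_655_372,
        "node-prefetch-misses": 2_201_441_900,
    },
    "autoNUMA": {
        "node-loads": 4_538_975_144,
        "node-load-misses": 2_666_374_830,
        "node-stores": 2_148_950_354,
        "node-store-misses": 1_682_942_931,
        "node-prefetches": 2_191_139_475,
        "node-prefetch-misses": 1_633_752_109,
    },
}

# (total event, miss event) pairs to compare.
PAIRS = [
    ("node-loads", "node-load-misses"),
    ("node-stores", "node-store-misses"),
    ("node-prefetches", "node-prefetch-misses"),
]


def miss_pct(misses, total):
    """Return misses as a percentage of total accesses."""
    return 100.0 * misses / total


def summarize(counters):
    """Map each miss event name to its miss percentage, rounded to 0.1."""
    return {
        miss: round(miss_pct(counters[miss], counters[total]), 1)
        for total, miss in PAIRS
    }


if __name__ == "__main__":
    for tree, counters in COUNTERS.items():
        print(tree)
        for event, pct in summarize(counters).items():
            print(f"  {event}: {pct}%")
```

These ratios only say how often a node-level access was counted as
a miss; they do not say how much each miss costs, so they cannot
settle the THP migration question on their own.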

>> Next I'll work on making multi-JVM more of an improvement, and
>> I'll also address any incoming regression reports.

> I have issues with multiple KVM VMs running either JBB or
> dbench-in-tmpfs, and I suspect whatever I am seeing is similar to
> whatever multi-JVM on bare metal is. What I typically see is no real
> convergence on a single node for resource usage for any of the VMs. For
> example, when running 8 VMs, 10 vCPUs each, a VM may have the following
> resource usage:

This is an issue. I have tried understanding the new local/shared
and shared task grouping code, but have not wrapped my mind around
that code yet.

I will have to look at that code a few more times, and ask more
questions of Ingo and Peter (and maybe ask some of the same questions
again - I see that some of my comments were addressed in the next
version of the patch, but the email never got a reply).

--
All rights reversed