Re: hackbench regression due to commit 9dfc6e68bfe6e

From: Pekka Enberg
Date: Wed Apr 07 2010 - 12:53:09 EST

Pekka Enberg wrote:
Christoph Lameter wrote:
I wonder if this is not related to the kmem_cache_cpu structure straddling
cache line boundaries under some conditions. On 2.6.33 the kmem_cache_cpu
structure was larger and therefore tight packing resulted in different

Could you see how the following patch affects the results? It attempts to
increase the size of kmem_cache_cpu to a power of 2 bytes. There is also
the potential that other per cpu fetches to neighboring objects affect the
situation. We could cacheline align the whole thing.

include/linux/slub_def.h | 5 +++++
1 file changed, 5 insertions(+)

Index: linux-2.6/include/linux/slub_def.h
--- linux-2.6.orig/include/linux/slub_def.h 2010-04-07 11:33:50.000000000 -0500
+++ linux-2.6/include/linux/slub_def.h 2010-04-07 11:35:18.000000000 -0500
@@ -38,6 +38,11 @@ struct kmem_cache_cpu {
void **freelist; /* Pointer to first free per cpu object */
struct page *page; /* The slab from which we are allocating */
int node; /* The node of the page (or -1 for debug) */
+#ifndef CONFIG_64BIT
+ int dummy1;
+ unsigned long dummy2;
unsigned stat[NR_SLUB_STAT_ITEMS];

Would __cacheline_aligned_in_smp do the trick here?

Oh, sorry, I think it's actually '____cacheline_aligned_in_smp' (with four underscores) for per-cpu data. Confusing...
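
To make the straddling concern concrete, here is a minimal userspace sketch (not kernel code, and not the patch above). It assumes a 64-byte cache line and the GCC/Clang `aligned` attribute, which is what the kernel's `____cacheline_aligned_in_smp` expands to on SMP builds, with SMP_CACHE_BYTES as the alignment. The struct names and the helper `count_straddlers()` are hypothetical, and the field list is simplified (no stat[] array).

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size; the kernel uses SMP_CACHE_BYTES */

/* Hypothetical stand-in for an unpadded kmem_cache_cpu (24 bytes on LP64). */
struct cpu_slab_packed {
	void **freelist;
	void *page;
	int node;
};

/* Same fields, but the whole structure is aligned to a cache line. */
struct cpu_slab_aligned {
	void **freelist;
	void *page;
	int node;
} __attribute__((aligned(CACHE_LINE)));

/*
 * Count how many of n tightly packed objects of the given size cross a
 * cache line boundary, assuming the first object starts on a boundary.
 */
static int count_straddlers(size_t size, int n)
{
	int i, hits = 0;

	for (i = 0; i < n; i++) {
		size_t first = (size_t)i * size;
		size_t last = first + size - 1;

		if (first / CACHE_LINE != last / CACHE_LINE)
			hits++;
	}
	return hits;
}

/* Sanity checks on the layout the attribute is expected to produce. */
static void check_layout(void)
{
	/* The attribute raises the alignment and rounds the size up to match. */
	assert(_Alignof(struct cpu_slab_aligned) == CACHE_LINE);
	assert(sizeof(struct cpu_slab_aligned) % CACHE_LINE == 0);
	/* Aligned objects never share or cross a line. */
	assert(count_straddlers(sizeof(struct cpu_slab_aligned), 8) == 0);
}
```

With 24-byte objects, count_straddlers(24, 8) reports two objects crossing a line, while a 32-byte (power of 2) size or the aligned variant reports none. The cost of full alignment is a 64-byte footprint per structure instead of 24 or 32.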
