RE: Regression with SLUB on Netperf and Volanomark

From: Christoph Lameter
Date: Fri May 04 2007 - 14:28:18 EST


If I optimize now for the case that we do not share the cpu cache between
different cpus then performance may drop for the case in which we share
the cache (hyperthreading).

If we do not share the cache then processors essentially need to have
their own lists of partial slabs in which they keep cache hot objects
(something mini-NUMA like). Any write to a shared object will cause
cacheline eviction on the other cpu, which is not good.
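
To illustrate the cacheline point outside the allocator, here is a minimal
userspace sketch. It is not SLUB code. The names (shared_line, split_lines,
run_pair) and the 64 byte line size are assumptions made for the example.
Two threads each increment their own counter; in one layout both counters
sit in the same cacheline, in the other each counter has its own line.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed line size, typical on x86 */

/* Both counters sit in one cacheline (struct is line aligned, b follows a). */
struct shared_line {
	_Alignas(CACHE_LINE) volatile unsigned long a;
	volatile unsigned long b;
};

/* Each counter is aligned to its own cacheline. */
struct split_lines {
	_Alignas(CACHE_LINE) volatile unsigned long a;
	_Alignas(CACHE_LINE) volatile unsigned long b;
};

struct job {
	volatile unsigned long *counter;
	unsigned long iterations;
};

/* Thread body: repeatedly write to this thread's own counter. */
static void *bump(void *arg)
{
	struct job *j = arg;

	for (unsigned long i = 0; i < j->iterations; i++)
		(*j->counter)++;
	return NULL;
}

/* Run two threads, one per counter, and return the sum of both counters. */
static unsigned long run_pair(volatile unsigned long *a,
			      volatile unsigned long *b,
			      unsigned long iterations)
{
	pthread_t t[2];
	struct job jobs[2] = { { a, iterations }, { b, iterations } };

	for (int i = 0; i < 2; i++) {
		int rc = pthread_create(&t[i], NULL, bump, &jobs[i]);

		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < 2; i++)
		pthread_join(t[i], NULL);
	return *a + *b;
}
```

Timing run_pair() on a struct shared_line against a struct split_lines
(for example with clock_gettime) should show the cost of the line bouncing
when the two threads run on cpus that do not share a cache. How large the
difference is depends on the machine.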

If they do share the cpu cache then they need to have a shared list of
partial slabs.
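
The two cases can be put side by side in a toy sketch. This is only an
illustration of the choice described above, not how SLUB is laid out: the
types (toy_cache, partial_list), the cpus_share_cache flag and MAX_CPUS
are all hypothetical, and locking is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 8	/* hypothetical limit for the example */

struct slab {
	struct slab *next;
	unsigned int free_objects;
};

struct partial_list {
	struct slab *head;
	unsigned long nr_partial;
};

struct toy_cache {
	int cpus_share_cache;			/* 1: hyperthreading-like case */
	size_t nr_cpus;
	struct partial_list shared;		/* used when the cache is shared */
	struct partial_list per_cpu[MAX_CPUS];	/* used when it is not */
};

/* Pick the partial list a given cpu should allocate from. */
static struct partial_list *partial_list_for(struct toy_cache *c, size_t cpu)
{
	assert(cpu < c->nr_cpus);
	return c->cpus_share_cache ? &c->shared : &c->per_cpu[cpu];
}

/* Push a partially used slab onto a list (no locking in this sketch). */
static void add_partial(struct partial_list *l, struct slab *s)
{
	s->next = l->head;
	l->head = s;
	l->nr_partial++;
}
```

With cpus_share_cache set, every cpu gets the same list, so cache hot
objects freed on one cpu can be reused on its sibling. With it clear, each
cpu only touches its own list and does not write to the other's slabs.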

Not sure where to go here. Increasing the per cpu slab size may hold off
the issue up to a certain cpu cache size. For that we would need to
identify which slabs create the performance issue.

One easy way to check that this is indeed the case: Enable fake NUMA. You
will then have separate queues for each processor since they are on
different "nodes". Create two fake nodes. Run one thread in each node and
see if this fixes it.