Re: [slub p4 0/7] slub: per cpu partial lists V4

From: Pekka Enberg
Date: Mon Aug 15 2011 - 04:44:25 EST


On 8/13/11 9:28 PM, David Rientjes wrote:
On Tue, 9 Aug 2011, Christoph Lameter wrote:

The following patchset introduces per cpu partial lists which allow
a performance increase of around 10-20% with hackbench on my Sandy Bridge
processor.

These lists help to avoid per node locking overhead. Allocator latency
could be further reduced by making these operations work without
disabling interrupts (like the fastpath and the free slowpath) but that
is another project.

It is interesting to note that BSD has gone to a scheme with partial
pages only per cpu (source: Adrian). Transfer of cpu ownerships is
done using IPIs. Probably too much overhead for our taste. The approach
here keeps the per node partial lists, essentially meaning the "pages"
in there have no cpu owner.


I'm currently 35,000 feet above Chicago going about 611 mph, so what
better time to benchmark this patchset on my netperf testing rack!

threads    before     after
     16     78031     74714  (-4.3%)
     32    118269    115810  (-2.1%)
     48    150787    150165  (-0.4%)
     64    189932    187766  (-1.1%)
     80    221189    223682  (+1.1%)
     96    239807    246222  (+2.7%)
    112    262135    271329  (+3.5%)
    128    273612    286782  (+4.8%)
    144    280009    293943  (+5.0%)
    160    285972    299798  (+4.8%)

I'll review the patchset in detail, especially the cleanups and
optimizations, when my wifi isn't so sketchy.

Andi, it'd be interesting to know your results for v4 of this patchset.
I'm hoping to get the patches reviewed and merged to linux-next this
week.

Pekka