hackbench [pthread mode] regression with 2.6.29-rc3

From: Zhang, Yanmin
Date: Sun Feb 01 2009 - 02:30:58 EST


Comparing with 2.6.29-rc2's result, hackbench [pthread mode] result is increased about
50%~100% with 2.6.29-rc3 on my 4 qual-core process tigerton machine and 4 qual-core
Montvale Itanium mahchine. The smaller result, the better performance.
ï
Command to run hackbench:
#./hackbench 100 thread 2000

Bisect located below patch.
commit 490dea45d00f01847ebebd007685d564aaf2cd98
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Mon Nov 24 17:06:57 2008 +0100

itimers: remove the per-cpu-ish-ness

Either we bounce once cacheline per cpu per tick, yielding n^2 bounces
or we just bounce a single..

Also, using per-cpu allocations for the thread-groups complicates the
per-cpu allocator in that its currently aimed to be a fixed sized
allocator and the only possible extention to that would be vmap based,
which is seriously constrained on 32 bit archs.


After above patch is reverted, hackbench result is restored.

yanmin


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/