Re: [tbench regression fixes]: digging out smelly deadmen.

From: Mike Galbraith
Date: Sat Oct 11 2008 - 14:14:36 EST


On Sat, 2008-10-11 at 16:39 +0200, Peter Zijlstra wrote:

> That said, we can probably still avoid the division for the top level
> stuff, because the sum of the top level weights is still invariant
> between all tasks.

Less math would be nice of course...

> I'll have a stab at doing so... I initially didn't do this because my
> first try gave some real ugly code, but we'll see - these numbers are a
> very convincing reason to try again.

...but the numbers I get on Q6600 don't pin the tail on the math donkey.

Update to UP test log.

2.6.27-final-up
ring-test - 1.193 us/cycle = 838 KHz (gcc-4.3)
tbench - 337.377 MB/sec tso/gso on
tbench - 340.362 MB/sec tso/gso off
netperf - 120751.30 rr/s tso/gso on
netperf - 121293.48 rr/s tso/gso off

2.6.27-final-up
patches/revert_weight_and_asym_stuff.diff
ring-test - 1.133 us/cycle = 882 KHz (gcc-4.3)
tbench - 340.481 MB/sec tso/gso on
tbench - 343.472 MB/sec tso/gso off
netperf - 119486.14 rr/s tso/gso on
netperf - 121035.56 rr/s tso/gso off

2.6.28-up
ring-test - 1.149 us/cycle = 870 KHz (gcc-4.3)
tbench - 343.681 MB/sec tso/gso off
netperf - 122812.54 rr/s tso/gso off
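
For anyone wanting to check the ring-test lines above, the rate is just the reciprocal of the time per cycle. A minimal sketch in Python; the helper name `cycles_khz` and the dictionary labels are made up for illustration, and only the us/cycle figures come from the log:

```python
def cycles_khz(us_per_cycle):
    """Convert a ring-test time per cycle (microseconds) to a rate in kHz."""
    return 1000.0 / us_per_cycle

# us/cycle figures copied from the UP log above; the labels are informal.
up_ring_test = {
    "2.6.27-final-up": 1.193,
    "2.6.27-final-up + revert": 1.133,
    "2.6.28-up": 1.149,
}

base = up_ring_test["2.6.27-final-up"]
for kernel, us in up_ring_test.items():
    # Positive delta means less time per cycle than plain 2.6.27-final-up.
    delta = (base - us) / base * 100.0
    print(f"{kernel:26s} {us:.3f} us/cycle {cycles_khz(us):6.1f} kHz {delta:+.1f}%")
```

The kHz figures in the log match these values with the fraction dropped.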

My SMP log, updated to account for TSO/GSO monkey-wrench.

(<bleep> truckload of time <bleep> wasted chasing unbisectable
<bleepity-bleep> tso gizmo. <bleep!>)

SMP config, same as UP kernels tested, except SMP.

tbench -t 60 4 localhost followed by four 60 sec netperf
TCP_RR pairs, each pair on its own core of my Q6600.

2.6.22.19

Throughput 1250.73 MB/sec 4 procs 1.00

16384 87380 1 1 60.01 111272.55 1.00
16384 87380 1 1 60.00 104689.58
16384 87380 1 1 60.00 110733.05
16384 87380 1 1 60.00 110748.88

2.6.22.19-cfs-v24.1

Throughput 1213.21 MB/sec 4 procs .970

16384 87380 1 1 60.01 108569.27 .992
16384 87380 1 1 60.01 108541.04
16384 87380 1 1 60.00 108579.63
16384 87380 1 1 60.01 108519.09

2.6.23.17

Throughput 1200.46 MB/sec 4 procs .959

16384 87380 1 1 60.01 95987.66 .866
16384 87380 1 1 60.01 92819.98
16384 87380 1 1 60.01 95454.00
16384 87380 1 1 60.01 94834.84

2.6.23.17-cfs-v24.1

Throughput 1238.68 MB/sec 4 procs .990

16384 87380 1 1 60.01 105871.52 .969
16384 87380 1 1 60.01 105813.11
16384 87380 1 1 60.01 106106.31
16384 87380 1 1 60.01 106310.20

2.6.24.7

Throughput 1204 MB/sec 4 procs .962

16384 87380 1 1 60.00 99599.27 .910
16384 87380 1 1 60.00 99439.95
16384 87380 1 1 60.00 99556.38
16384 87380 1 1 60.00 99500.45

2.6.25.17

Throughput 1223.16 MB/sec 4 procs .977

16384 87380 1 1 60.00 101768.95 .930
16384 87380 1 1 60.00 101888.46
16384 87380 1 1 60.01 101608.21
16384 87380 1 1 60.01 101833.05

2.6.26.5

Throughput 1183.47 MB/sec 4 procs .945

16384 87380 1 1 60.00 100837.12 .922
16384 87380 1 1 60.00 101230.12
16384 87380 1 1 60.00 100868.45
16384 87380 1 1 60.00 100491.41

numbers above here are gcc-4.1, below gcc-4.3

2.6.26.6

Throughput 1177.18 MB/sec 4 procs

16384 87380 1 1 60.00 100896.10
16384 87380 1 1 60.00 100028.16
16384 87380 1 1 60.00 101729.44
16384 87380 1 1 60.01 100341.26

TSO/GSO off

2.6.27-final

Throughput 1177.39 MB/sec 4 procs

16384 87380 1 1 60.00 98830.65
16384 87380 1 1 60.00 98722.47
16384 87380 1 1 60.00 98565.17
16384 87380 1 1 60.00 98633.03

2.6.27-final
patches/revert_weight_and_asym_stuff.diff

Throughput 1167.67 MB/sec 4 procs

16384 87380 1 1 60.00 97003.05
16384 87380 1 1 60.00 96758.42
16384 87380 1 1 60.00 96432.01
16384 87380 1 1 60.00 97060.98

2.6.28.git

Throughput 1173.14 MB/sec 4 procs

16384 87380 1 1 60.00 98449.33
16384 87380 1 1 60.00 98484.92
16384 87380 1 1 60.00 98657.98
16384 87380 1 1 60.00 98467.39
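
The trailing column in the gcc-4.1 entries (1.00, .970, .992, ...) looks like throughput relative to 2.6.22.19. A rough sketch of that calculation in Python follows. Two assumptions that the log itself does not state: the netperf ratio is taken over the mean of the four TCP_RR pairs, and 2.6.22.19 is the baseline. The names `runs` and `relative_to` are illustrative only, and only the first four kernels are included to keep it short.

```python
from statistics import mean

# Figures copied from the gcc-4.1 part of the SMP log above.
# Each entry: (tbench MB/sec, [four netperf TCP_RR transaction rates]).
runs = {
    "2.6.22.19": (1250.73, [111272.55, 104689.58, 110733.05, 110748.88]),
    "2.6.22.19-cfs-v24.1": (1213.21, [108569.27, 108541.04, 108579.63, 108519.09]),
    "2.6.23.17": (1200.46, [95987.66, 92819.98, 95454.00, 94834.84]),
    "2.6.23.17-cfs-v24.1": (1238.68, [105871.52, 105813.11, 106106.31, 106310.20]),
}

def relative_to(runs, baseline):
    """Return {kernel: (tbench ratio, netperf ratio)} against the baseline kernel."""
    base_tbench, base_rr = runs[baseline]
    base_rr_mean = mean(base_rr)  # assumption: average the four pairs
    return {
        kernel: (tbench / base_tbench, mean(rr) / base_rr_mean)
        for kernel, (tbench, rr) in runs.items()
    }

ratios = relative_to(runs, "2.6.22.19")
for kernel, (tb, rr) in ratios.items():
    print(f"{kernel:22s} tbench {tb:.3f} netperf {rr:.3f}")
```

The results land within about .001 of the logged column for these four kernels. The last digit can differ because the logged figures do not appear to be rounded the same way.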


