Re: HT schedulers' performance on single HT processor

From: Nathan Fredrickson
Date: Mon Dec 15 2003 - 19:18:21 EST

Next message: Rik van Riel: "Re: 2.6.0-test9 - poor swap performance on low end machines"
Previous message: Zwane Mwaikambo: "Re: [CFT][RFC] HT scheduler"
In reply to: Con Kolivas: "Re: HT schedulers' performance on single HT processor"
Next in thread: Con Kolivas: "Re: HT schedulers' performance on single HT processor"

On Mon, 2003-12-15 at 05:11, Con Kolivas wrote:
> On Mon, 15 Dec 2003 06:49, Nathan Fredrickson wrote:
> > I can also run the same on four physical processors if there is
> > interest.
>
> The specific HT scheduler benefits only start appearing with more physical
> cpus which is to be expected. Just for demonstration the four processor run
> would be nice (and obviously take you less time to do ;). I think it will
> demonstrate it even more. It would be nice to help the most common case of
> one HT cpu, though, instead of hindering it.

Here are some results on four physical processors. Unfortunately my
quad systems are a different speed than the dual systems used for the
previous tests so the results are not directly comparable.

Same test as before, a 2.6.0 kernel compile with make -jX vmlinux.
Results are the best real time out of five runs.
Hardware: Xeon HT 1.4GHz
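
The procedure above (time `make -jX vmlinux`, keep the best real time out of five runs) could be scripted roughly as follows. This is only a sketch, not the script used for these results: the helper names are hypothetical, and running `make clean` before each timed build is an assumption, since the post does not say how the tree was reset between runs.

```python
import subprocess
import time


def make_command(jobs):
    """Build the compile command for a given -jX value."""
    return ["make", f"-j{jobs}", "vmlinux"]


def best_real_time(run_once, prepare=None, runs=5):
    """Return the smallest wall-clock time in seconds over `runs` calls.

    `prepare`, if given, is called before each run and is not timed.
    """
    timings = []
    for _ in range(runs):
        if prepare is not None:
            prepare()
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return min(timings)


def time_kernel_build(jobs, tree=".", runs=5):
    """Best-of-`runs` real time for `make -jX vmlinux` in `tree`."""
    def clean():
        # Assumption: reset the tree with `make clean` before each run.
        subprocess.run(["make", "clean"], cwd=tree, check=True,
                       stdout=subprocess.DEVNULL)

    def build():
        subprocess.run(make_command(jobs), cwd=tree, check=True,
                       stdout=subprocess.DEVNULL)

    return best_real_time(build, prepare=clean, runs=runs)
```

Calling `time_kernel_build(x)` for each X in the table would then give one row of raw times for a given kernel and BIOS configuration.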

Test cases:
1phys UP       - UP test11 kernel with HT disabled in the BIOS
4phys SMP      - SMP test11 kernel on 4 physical procs with HT disabled
4phys HT       - SMP test11 kernel on 4 physical procs with HT enabled
4phys HT (w26) - same as above with Nick's w26 sched-rollup patch
4phys HT (C1)  - same as above with Ingo's C1 patch

Here are the results normalized to the X=1 UP case to make comparisons
easier. Lower is better.

X =                 1     2     3     4     5     6     7     8     9    16
1phys UP         1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
4phys SMP        1.00  0.99  0.51  0.35  0.27  0.27  0.27  0.27  0.27  0.27
4phys HT         1.01  1.00  0.55  0.40  0.33  0.29  0.27  0.26  0.25  0.26
4phys HT (w26)   1.01  1.01  0.54  0.37  0.31  0.27  0.26  0.26  0.26  0.26
4phys HT (C1)    1.01  1.00  0.52  0.36  0.29  0.28  0.27  0.26  0.25  0.26

Interesting that the overhead due to HT in the X=1 column is only 1%
with 4 physical processors. It was 1-3% before with 1 or 2 physical
processors.
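
For reference, the 1% figure is the HT rows compared against the SMP row in the X=1 column. A small sketch of that arithmetic, using only values from the table (the function and variable names are illustrative):

```python
# Normalized X=1 results copied from the table (1phys UP at X=1 is 1.00).
x1 = {
    "4phys SMP": 1.00,
    "4phys HT": 1.01,
    "4phys HT (w26)": 1.01,
    "4phys HT (C1)": 1.01,
}


def overhead_percent(value, baseline):
    """Slowdown of `value` relative to `baseline`, in percent."""
    return round((value - baseline) / baseline * 100, 1)


# Overhead of each HT configuration relative to SMP with HT disabled.
ht_overhead = {
    name: overhead_percent(value, x1["4phys SMP"])
    for name, value in x1.items()
    if name != "4phys SMP"
}

for name, pct in ht_overhead.items():
    print(f"{name}: {pct}% slower than SMP at X=1")
```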

In the partial-load columns, where there are fewer compiler processes than
logical CPUs (X=3,4,5,6,7), both patches appear to be doing a better job
of scheduling than the standard scheduler. At full load (X>=8) all three
HT test cases perform about equally and beat standard SMP by 0.01-0.02
on the normalized scale, i.e. 1-2% of the UP time.
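
The column-by-column comparison can be reproduced from the table with a few lines of Python. This is a sketch over the normalized numbers only; the names are illustrative, and differences are expressed in hundredths of the UP time (negative means faster).

```python
jobs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 16]

# Normalized results for the 4-processor rows, copied from the table.
results = {
    "SMP":      [1.00, 0.99, 0.51, 0.35, 0.27, 0.27, 0.27, 0.27, 0.27, 0.27],
    "HT":       [1.01, 1.00, 0.55, 0.40, 0.33, 0.29, 0.27, 0.26, 0.25, 0.26],
    "HT (w26)": [1.01, 1.01, 0.54, 0.37, 0.31, 0.27, 0.26, 0.26, 0.26, 0.26],
    "HT (C1)":  [1.01, 1.00, 0.52, 0.36, 0.29, 0.28, 0.27, 0.26, 0.25, 0.26],
}


def delta(row, reference, x):
    """results[row] minus results[reference] at -jX, in hundredths of UP time."""
    i = jobs.index(x)
    return round((results[row][i] - results[reference][i]) * 100)


partial_load = [3, 4, 5, 6, 7]   # fewer compiler processes than logical CPUs
full_load = [8, 9, 16]

# Patched schedulers versus the standard scheduler with HT enabled.
patch_vs_stock = {
    patch: [delta(patch, "HT", x) for x in partial_load]
    for patch in ("HT (w26)", "HT (C1)")
}

# All HT cases versus SMP with HT disabled at full load.
ht_vs_smp = {
    row: [delta(row, "SMP", x) for x in full_load]
    for row in ("HT", "HT (w26)", "HT (C1)")
}

print(patch_vs_stock)
print(ht_vs_smp)
```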

Hope these results are helpful. I'd be happy to run more cases and/or
other patches.

Nathan

