First of all, please excuse this off-topic posting. I'm running a test
on a 4-CPU box (Siemens Primergy 870), comparing Linux/Apache with
NT/IIS. I got some *very* strange results and want to make sure that
you, the developers, have a chance to comment on them *before* I
publish them.
If there's a better way to do this, please feel free to tell me.
So now to the facts:
I've got a Siemens Primergy 870 with:
4 CPUs PII Xeon 450 MHz
2 GByte RAM
Mylex RAID Controller DAC 960 (64MB RAM), RAID5
2 x Intel EEpro 100 (only one is used, the other is not ifconfig'ed but
detected)
Linux 2.2.8 (SuSE 6.1, glibc-based)
Apache 1.3.6
(NT 4.0 SP4, IIS 4.0)
Network: switched 100 MBit/s, half duplex
Neither system is intensively tuned, apart from the obvious stuff such as
disabling unneeded modules, turning off hostname lookups, ...
I'm measuring plain HTTP GETs of a static HTML file with 8 (Linux)
clients, each running up to 64 processes (for a total of 512) that issue
HTTP GET requests in a tight loop. All files come from a partition on the
RAID array (RAID 5), which is also used for logging. The file size is
4 KByte (I also measured 1k and 8k, with similar results).
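For readers who want to reproduce this kind of load, one client process of the sort described above could look roughly like the following Python sketch. It is not the tool used for these measurements; the function name `fetch_loop`, the choice of a new connection per request, and the 5-second timeout are assumptions made for illustration.

```python
import http.client
import time


def fetch_loop(host, port, path, n_requests):
    """Issue n_requests GETs back to back; return (ok, failed, elapsed_seconds)."""
    ok = failed = 0
    start = time.monotonic()
    for _ in range(n_requests):
        # Assumption: one new TCP connection per request (no keep-alive).
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()  # drain the body so the transfer is complete
            if resp.status == 200:
                ok += 1
            else:
                failed += 1
        except (OSError, http.client.HTTPException):
            # connect() failures and broken responses are counted, not fatal
            failed += 1
        finally:
            conn.close()
    return ok, failed, time.monotonic() - start
```

Running 64 such processes on each of 8 client machines would give the 512 concurrent requesters mentioned above; requests per second is then the summed `ok` counts divided by the elapsed time.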
Results:
-- With one CPU, NT/IIS and Linux/Apache deliver about the same
performance (+-10%). Even Linux SMP and non-SMP differ by < 10%.
-- With 4 CPUs, NT/IIS gives only slightly more requests/sec (< 10%);
that is mainly because my setup cannot generate really heavy load with
plain HTTP GETs. One CPU is obviously sufficient for this task here (as
in most real-life setups that serve plain HTML).
-- Linux with 4 CPUs is disastrous: it delivers significantly fewer
requests/sec (RPS) than the single-CPU version, by about a factor of 4!
Only at high loads (256 processes and up) does it catch up.
One other strange thing (which I still have to double-check):
- Linux with 4 CPUs: I get slightly better performance if the processes
fetch a random page (out of 10000 files, so that they still fit in the
buffer cache). All the other combinations (Linux 1 CPU, NT) are a bit
slower with random pages than when fetching only one page.
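A quick back-of-the-envelope check of the "still fit in the buffer cache" remark, as a Python sketch. It assumes the random-page test uses the same 4 KByte files as the main test and ignores per-file overhead (inodes, directory entries); the names are illustrative only.

```python
FILE_COUNT = 10_000          # number of files in the random-page test
FILE_SIZE = 4 * 1024         # bytes; assumption: same 4 KByte files as the main test
RAM_BYTES = 2 * 1024**3      # 2 GByte of RAM in the test machine


def working_set_mib(count, size):
    """Total size of the file set in MiB (file data only)."""
    return count * size / 1024**2


total_mib = working_set_mib(FILE_COUNT, FILE_SIZE)
share_of_ram = FILE_COUNT * FILE_SIZE / RAM_BYTES

print(f"{total_mib:.1f} MiB, {share_of_ram:.1%} of RAM")
```

Under these assumptions the file set comes to roughly 39 MiB, about 2% of the 2 GByte of RAM, so it should comfortably stay cached.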
For clarity I have included a picture (it says more than a lot of words).
The red line is the performance of Linux with 4 CPUs, blue is with random
files, green is Linux with one CPU, and pink is NT with 1 CPU (with 4 CPUs
it is slightly above the Linux curve). The drop at 512 processes is due to
connect() failures; I'm currently investigating this, too.
Do you have any idea what is happening there?
Or, even better, how to fix it?
It seems to me that this might be exactly the factor of 4 that the
Mindcraft people measured. Do you think that is possible?
bye, juergen
BTW: In another test with CGI scripts I raised NR_OPEN in the Linux kernel
from 1024 to 2048 (in include/linux/limits.h, fs.h and posix_types.h)
and recompiled Apache.
That got rid of the open() errors that occurred under heavy load. But
in exchange I now get only about half the RPS with <= 128 processes. Did I
forget something?
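When chasing open() errors under load, it can help to check the per-process descriptor limit that a process actually sees, since a kernel constant such as NR_OPEN is not the only limit involved. The following Python sketch reads that limit through the standard `resource` module (Unix only). It is a diagnostic aid under that assumption and says nothing about the cause of the slowdown described above; the helper name `fd_limits` is made up for illustration.

```python
import resource


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = fd_limits()
print("open-file limit: soft =", soft, "hard =", hard)
```

Run from the same environment that starts the web server, this shows whether the raised kernel limit is actually visible to the server processes.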
PS: Please CC me by mail, as I am not subscribed to the list.
--
Juergen Schmidt  Redakteur/editor  c't magazin  PGP-Key available
Verlag Heinz Heise GmbH & Co KG, Helstorferstr. 7, D-30625 Hannover
EMail: ju@ct.heise.de - Tel.: +49 511 5352 300 - FAX: +49 511 5352 417

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/