I suspect this because of a behaviour that requires tuning in Apache --
the MaxSpareServers parameter. When MaxSpareServers is too high, httpd
consumes a large amount of CPU doing essentially nothing. Halving
MaxSpareServers frequently gives the same or better performance with
far less CPU usage (it'll depend on your bottlenecks, of course).
This happens even when all the spare children are still in RAM and
not swapped out. The only explanation I've come up with so far (other
than the admittedly poor child management code in Apache) is that
there's a boundary beyond which the L2 cache is overrun.
So, if anyone has the time and inclination, here's an experiment. Get a
recent Apache. Learn why src/conf.h defines USE_FCNTL_SERIALIZED_ACCEPT
for LINUX and figure out whether you can remove that define (you may
need to fix a kernel problem, I'm not sure...). Once that's done and
it's running stably, benchmark it. Now hack your Linux kernel to wake
tasks in LIFO order out of accept() and benchmark again.
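
For anyone who hasn't looked at what serialized accept means in practice,
here is a minimal sketch of the general idea: each child takes a blocking
fcntl() lock on a shared lock file before calling accept(), so only one
child at a time sleeps in accept(). This is not Apache's actual code. The
names set_lock and serialized_accept are made up for illustration, and
the retry-on-EINTR handling is an assumption about what you'd want.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: take (F_WRLCK) or release (F_UNLCK) an advisory
 * lock covering the whole lock file. F_SETLKW blocks until the lock is
 * available; retry if a signal interrupts the wait. */
static int set_lock(int lock_fd, short type)
{
    struct flock fl = {0};
    int rc;

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file" */

    do {
        rc = fcntl(lock_fd, F_SETLKW, &fl);
    } while (rc < 0 && errno == EINTR);
    return rc;
}

/* Hypothetical helper: accept one connection while holding the lock.
 * Returns the connected fd, or -1 with errno set by the failing call. */
int serialized_accept(int lock_fd, int listen_fd)
{
    int conn, saved_errno;

    assert(lock_fd >= 0 && listen_fd >= 0);

    if (set_lock(lock_fd, F_WRLCK) < 0)
        return -1;

    do {
        conn = accept(listen_fd, NULL, NULL);
    } while (conn < 0 && errno == EINTR);

    /* Release the lock so the next child can enter accept(), and keep
     * accept()'s errno intact for the caller. */
    saved_errno = errno;
    set_lock(lock_fd, F_UNLCK);
    errno = saved_errno;
    return conn;
}
```

Each child would call serialized_accept() in its main loop with a
lock_fd opened read-write on the same file. Removing the define amounts
to calling accept() directly from every child, which is the case the
experiment above is meant to measure.
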
Report results to me please, especially if you find that modern Linux
kernels don't need USE_FCNTL_SERIALIZED_ACCEPT, since removing it would
speed up Apache a smidgen anyhow.
Dean