Hello scheduler hackers,
I just noticed a behaviour of the scheduler that got me thinking...
This is my test scenario:
On an otherwise unloaded machine, I run a server process that accepts
TCP connections and, after a client has connected, simply echoes back
all the packets that the client sends. A single client (sitting on
another machine) connects to the server and then continuously sends
packets (of about 1000 bytes) and reads the echo. For each packet, the
client measures the round-trip time it takes to send the packet and
receive the echo. If nothing else happens on the server, these times
are always very short, a few milliseconds or less.
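To make this concrete, here is roughly what the client side does. This is
only a minimal sketch with plain sockets; the port number, the fixed
1000-byte payload buffer and the missing error handling are placeholders,
not the exact code I used:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define PAYLOAD 1000   /* roughly the packet size used in the test */

int main(int argc, char **argv)
{
	char buf[PAYLOAD];
	struct sockaddr_in srv;
	int fd;

	if (argc < 2)
		return 1;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	memset(&srv, 0, sizeof(srv));
	srv.sin_family = AF_INET;
	srv.sin_port = htons(7777);                 /* server port (assumed) */
	inet_pton(AF_INET, argv[1], &srv.sin_addr); /* server IP from cmdline */
	connect(fd, (struct sockaddr *)&srv, sizeof(srv));

	memset(buf, 'x', sizeof(buf));
	for (;;) {
		struct timeval t0, t1;
		ssize_t done = 0;

		gettimeofday(&t0, NULL);
		write(fd, buf, sizeof(buf));
		while (done < PAYLOAD) {            /* read back the full echo */
			ssize_t n = read(fd, buf, sizeof(buf) - done);
			if (n <= 0)
				return 1;
			done += n;
		}
		gettimeofday(&t1, NULL);

		printf("rtt: %ld us\n",
		       (long)((t1.tv_sec - t0.tv_sec) * 1000000L +
			      (t1.tv_usec - t0.tv_usec)));
	}
	return 0;
}

The server side is just the mirror image of this: accept the connection
and write back whatever it reads.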
While this echoing test is running, I put the server machine under
massive CPU load by starting a load-generating process that spawns a
number of threads. All threads run an endless loop without any I/O or
other blocking.
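The load generator is essentially nothing more than this (again only a
sketch, assuming plain POSIX threads and taking the thread count from
the command line):

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

static void *spin(void *arg)
{
	volatile unsigned long x = 0;

	(void)arg;
	for (;;)        /* pure CPU burn, no I/O, no blocking */
		x++;
	return NULL;
}

int main(int argc, char **argv)
{
	int i, nthreads = argc > 1 ? atoi(argv[1]) : 100;
	pthread_t tid;

	for (i = 0; i < nthreads; i++)
		pthread_create(&tid, NULL, spin, NULL);

	pause();        /* keep the main thread alive */
	return 0;
}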
Behaviour with 2.6.27-rc6: If the number of threads started by the
load-generating process is sufficiently large (> 100), the server
process seems to be stalled during the startup of the load generator.
With 100 threads in the load generator, the client observes one or two
round-trip times of more than 1 second during load-generator startup.
When the load generator starts 1000 threads simultaneously, the client
is stalled several times, one of the stalls lasting more than 30
seconds. However, this stalling only appears during the startup of the
load generator. After some time, the round-trip times observed by the
client settle down, and from that point on are all reasonably short
again.
I also conducted this test with older kernels:
2.4.36: With this kernel, the behaviour was really weird: when the
server was loaded with >100 threads, the client was stalled again and
again for several seconds, then ran smoothly for some time, until
another period of stalling began. It looked like the scheduler screwed
up periodically, until the bubble burst and the stalling disappeared
for a few seconds.
2.6.25.16: The client is only stalled during the startup of the
load-generating process, but for really long times. With 200 threads in
the load generator, I observed stalling for more than a minute.
Thus, the current kernel seems to pass this test best, but not
perfectly. Of course one might argue (like my colleagues do) that this
test presents a completely unrealistic scenario: hundreds of threads
started at the same time, none of them blocking. However, if I think of
a 'BIG' Java application server, I can imagine that a similar situation
could arise. From this perspective, one might say the behaviour of the
scheduler is not optimal and should be improved. If a server does not
respond for one minute, its clients might (reasonably, but erroneously
in this case) conclude that the server is severely broken.
What do you think ?
BTW, the server was equipped with a dual-core 3.40 GHz Pentium CPU.