Re: 2.2.1 scheduler behavior

Neil Conway
Fri, 19 Feb 1999 11:27:48 +0000

Claude Gamache wrote:
> Hi,
> We have an application made up of 15 processes that interact through
> shared memory and sockets. Each process computes for a few
> milliseconds and then passes its data to the next one (a pipeline
> computation scheme). When a process is done with its computation, it
> calls pause() to release the CPU; it resumes computing when its
> timer interrupt wakes it every 10 milliseconds. If the computation
> has not completed by then, we log this as an overrun.
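
For reference, the wake-every-10-ms loop described above might look
roughly like the sketch below. This is a guess at the structure, not the
original application's code: run_periods() and spin_briefly() are
hypothetical names, and it waits with sigsuspend() rather than pause() so
that a tick arriving just before the wait is not lost. A tick that is
already pending when the work finishes is counted as an overrun.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Empty handler: we only need SIGALRM to interrupt sigsuspend(). */
static void on_tick(int signo) { (void)signo; }

/* Hypothetical stand-in for one stage's computation. */
void spin_briefly(void)
{
    volatile unsigned long x = 0;
    unsigned long i;
    for (i = 0; i < 100000UL; i++)
        x += i;
}

/* Run `periods` cycles of `work`, one per timer tick.
   Returns the number of overruns, or -1 if setup fails. */
int run_periods(int periods, long period_us, void (*work)(void))
{
    struct sigaction sa;
    struct itimerval it, off;
    sigset_t block, old, wait_mask, pending;
    int i, overruns = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) < 0)
        return -1;

    /* Keep SIGALRM blocked while working, so a tick that fires
       during the work stays pending until we look for it. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;
    wait_mask = old;
    sigdelset(&wait_mask, SIGALRM);

    it.it_interval.tv_sec = period_us / 1000000;
    it.it_interval.tv_usec = period_us % 1000000;
    it.it_value = it.it_interval;
    if (setitimer(ITIMER_REAL, &it, NULL) < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    for (i = 0; i < periods; i++) {
        work();
        sigpending(&pending);
        if (sigismember(&pending, SIGALRM))
            overruns++;         /* the tick fired before work finished */
        sigsuspend(&wait_mask); /* atomically unblock and wait for the tick */
    }

    memset(&off, 0, sizeof off);
    setitimer(ITIMER_REAL, &off, NULL);   /* stop the timer */
    sigprocmask(SIG_SETMASK, &old, NULL); /* restore the signal mask */
    return overruns;
}
```

Calling run_periods(100, 10000, spin_briefly) would run 100 periods of
10 ms each and report how many of them overran.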

Sounds like a truly strange application...

> With Kernel 2.0.36 on a Dell dual PII 400 MHz with 256 MB RAM all
> processes are able to compute within their allocated 10 ms time frame.
> With Kernel 2.2.1 on a Dell dual PII 450 MHz with 256 MB RAM the
> processes keep overrunning! (same configuration, same amount of
> computation). With Kernel 2.0.36 on the Dell dual PII 450 MHz, things
> are normal (no overruns).
> With Kernel 2.2.1, if I change the scheduling policy to FIFO with a
> priority of 90 to 95, then things improve (there are fewer
> overruns). But I observed something strange. If I use the Real Time
> Clock (/dev/rtc) at 8192 Hz (or any high "enough" frequency), things
> improve even more: there are far fewer overruns for the very same
> computation.

I can't help you with this RTC weirdness I'm afraid.

> It looks as if the use of the RTC at high frequencies causes the
> Kernel scheduler to be executed more often.
> Also, with Kernel 2.2.1, I noticed that a process has a tendency
> to keep switching from one CPU to another (that process being the only
> one using a noticeable amount of CPU according to ps and top). With
> 2.0.36, it almost never switches CPU! That could explain why
> performance seems more random with 2.2.1.
> In the scheduler, is PROC_CHANGE_PENALTY large enough, or is it
> something else?

PROC_CHANGE_PENALTY is something that I've looked at; in fact I'm the
person who had it *reduced* in the late 2.1 series (it was causing
problems with interactive performance because it was just big enough to
cause bad behaviour).

Basically, the scheduler isn't at its happiest doing what you want.

PROC_CHANGE_PENALTY is *huge*, but only if you are computing for a whole
timeslice and then being pre-empted, and also (this is a bit dumb) only
on a HZ=100 machine (on the Alpha it's an order of magnitude lower since
it's specified in jiffies not real-time). If a compute-bound process
swaps CPU *every* timeslice on an x86, big deal - how long can it
possibly take on say a PII-512kB cache to refill the L2 cache
completely? A damn small percentage of 200ms I'll bet. My guess is
more like 5ms and that's being really generous. Older CPUs have
smaller L2 caches and probably shared ones anyway so it won't be very
important for them either.

Doesn't 150ms for PROC_CHANGE_PENALTY sound big enough to you? I'd chop
it down to <=5 jiffies personally :-)
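
To make the units concrete, here is a small sketch converting a penalty
in jiffies to wall-clock time. The 15-jiffy figure is inferred from the
150ms at HZ=100 quoted above, and HZ=1024 for the Alpha is an
assumption; jiffies_to_ms() is an illustrative helper, not a kernel
function.

```c
#include <stdio.h>

/* Convert a duration in jiffies to milliseconds for a given HZ. */
double jiffies_to_ms(unsigned int jiffies, unsigned int hz)
{
    return 1000.0 * (double)jiffies / (double)hz;
}

/* Print the same penalty at two tick rates (values are assumptions). */
void print_penalty(unsigned int penalty_jiffies)
{
    printf("HZ=100:  %.1f ms\n", jiffies_to_ms(penalty_jiffies, 100));
    printf("HZ=1024: %.1f ms\n", jiffies_to_ms(penalty_jiffies, 1024));
}
```

The same number of jiffies is worth roughly ten times less real time at
the higher tick rate, which is the "order of magnitude" difference
mentioned above.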

On the other hand, if you are doing weird stuff like you describe ;-)),
then suddenly you may have a much larger portion of your timeslice
wasted with cache misses... But upping PROC_CHANGE_PENALTY probably
still won't help you there, since it's such a global thing. You may
want to tweak the scheduler a little.

By all means though recompile with a bigger PROC_CHANGE_PENALTY and see
what happens.

Hmm, some new thoughts just came to me:
a) You should try to run it on a UP kernel and see if it works properly.

b) The problem may be not so much that the processes are being slowed
down by cache misses, but that they don't restart immediately after they
are signalled. I'm a little unclear as to whether your processes operate
one-at-a-time or several-at-once. If it's one-at-a-time then maybe
cache-misses are to blame. It sounds to me though, as if you have
several in the run-queue at any one time. Maybe the run-queue isn't
being fair to some of them... Quick check: if your processes use 3ms
apiece and want to do that every 10ms, then you can have AT MOST 3.3 per
CPU to have any hope of avoiding overruns. So something about my
interpretation of your application is broken anyway...
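
The quick check above, written out as a sketch. The 3ms and 10ms
figures are the hypothetical ones from the paragraph;
max_procs_per_cpu() and fits() are illustrative names.

```c
/* How many processes one CPU can serve per period if each needs
   work_ms of CPU time (ignores all scheduling overhead). */
double max_procs_per_cpu(double period_ms, double work_ms)
{
    return period_ms / work_ms;
}

/* Returns 1 if nprocs processes, each needing work_ms every period_ms,
   can fit on ncpus CPUs at all; 0 if overruns are unavoidable. */
int fits(int nprocs, int ncpus, double period_ms, double work_ms)
{
    return nprocs * work_ms <= ncpus * period_ms;
}
```

With 15 processes on 2 CPUs, fits(15, 2, 10.0, 3.0) is 0: the work adds
up to 45ms per 10ms period against 20ms of available CPU time.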

ps: this is probably an SMP issue only.
