Re: Why are Suns so slow?

From: Bernd Paysan (bernd.paysan@gmx.de)
Date: Sun Mar 05 2000 - 18:09:00 EST


[posted in comp.arch and comp.sys.intel, cc'ed to linux-kernel mailing
list]

ttk@goblin.flyingcroc.net
> The Linux community has made strong steps in all of these directions,
> but still has a long journey ahead.

The last discussion when someone posted a O(1) scheduler to replace the
O(n) scheduler (with a single global run-queue for all processors),
there was a huge mumble bumble on the kernel mailing list about
"introducing bloat" and "long runqueues don't happen" and "the constant
factor is important" (which was almost zero for that algorithm, since he
didn't use his improvements for very short run queues), so it's somewhat
questionable. I don't see why they have a "long journey", since the
scheduler algorithms in question are well-known text-book algorithms.

The basic principle of the BSD/QNX scheduler (and all other prioritized
O(1) schedulers) is to have a run-queue for each priority (or "goodness"
or whatever you call it - just quantisize it so that there aren't too
many queues). When descheduling a process, you remember which one was
the highest used run-queue, compute the priority of the process to
deschedule, and put it into the corresponding run-queue - first to get a
bonus for processes that are already cached in, last on sched_yield().
When scheduling, you take the first process out of the highest priority
run-queue.

SMP-variation: keep separated run-queues for each processor. Usually
schedule only within the own CPU's run-queue, and only look at other
processor's run-queues, if their highest unscheduled priority is within
some offset higher than yours. This means that for the average case, the
scheduler is also lock-free (reading the global max-priority is atomic,
and therefore doesn't need a lock - and updating is rare), and you get
CPU preferrance for free.

Since a lot of things that are done explicit in the Linux goodness()
function are done implicit here (like preferring processes that ran on
the same CPU last time or to keep the previous process running as long
as nobody has definitely higher goodness), the algorithm should have
*lower* constant factor than the current O(n) scheduler, and should be
already better if there is one process to run.

There were several patches to the Linux kernel in the past that all
featured one of those O(1) schedulers, and none of them made it into the
kernel. Linux kernel hackers: even though the BSDers are arrogant, they
are not always wrong. O(n) schedulers don't belong to high-loaded
servers. IMHO the 2.4er kernel should provide a lock-free O(1) scheduler
(well, basically), if it should run well on machines with many CPUs.

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Mar 07 2000 - 21:00:18 EST