[patch] CFS (Completely Fair Scheduler), v2

From: Ingo Molnar
Date: Mon Apr 16 2007 - 18:08:12 EST



this is the second release of the CFS (Completely Fair Scheduler)
patchset, against v2.6.21-rc7:

http://redhat.com/~mingo/cfs-scheduler/sched-cfs-v2.patch

i'd like to thank everyone for the tremendous amount of feedback and
testing the v1 patch got - i could hardly keep up with just reading the
mails! Some of the stuff people addressed i couldnt implement yet, i
mostly concentrated on bugs, regressions and debuggability.

there's a fair amount of churn:

15 files changed, 456 insertions(+), 241 deletions(-)

But it's an encouraging sign that there was no crash bug found in v1,
all the bugs were related to scheduling-behavior details. The code was
tested on 3 architectures so far: i686, x86_64 and ia64. Most of the
code size increase in -v2 is due to debugging helpers, they'll be
removed later. (The new /proc/sched_debug file can be used to see the
fine details of CFS scheduling.)

Changes since -v1:

- make nice levels less starvable. (reported by Willy Tarreau)

- fixed child-runs first. A /proc/sys/kernel/sched_child_runs_first
flag can be used to turn it on/off. (This might fix the Kaffeine bug
reported by S.ÃaÄlar Onur <)

- changed SCHED_FAIR back to SCHED_NORMAL (suggested by Con Kolivas)

- UP build fix. (reported by Gabriel C)

- timer tick micro-optimization (Dmitry Adamushko)

- preemption fix: sched_class->check_preempt_curr method to decide
whether to preempt after a wakeup (or at a timer tick). (Found via a
fairness-test-utility written for CFS by Mike Galbraith)

- start forked children with neutral statistics instead of trying to
inherit them from the parent: Willy Tarreau reported that this
results in better behavior on extreme workloads, and it also
simplifies the code quite nicely. Removed sched_exit() and the
->task_exit() methods.

- make nice levels independent of the sched_granularity value

- new /proc/sched_debug file listing runqueue details and the rbtree

- new SCH-* fields in /proc/<NR>/status to see scheduling details

- new cpu-hog feature (off by default) and sysctl tunable to set it:
/proc/sys/kernel/sched_max_hog_history_ns tunable defaults to
0 (off). Positive values are meant the maximum 'memory' that the
scheduler has of CPU hogs.

- various code cleanups

- added more statistics temporarily: sum_exec_runtime,
sum_wait_runtime.

- added -CFS-v2 to EXTRAVERSION

as usual, any sort of feedback, bugreports, fixes and suggestions are
more than welcome,

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/