Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

From: James Bruce
Date: Wed Apr 18 2007 - 03:29:34 EST


Matt Mackall wrote:
On Tue, Apr 17, 2007 at 03:59:02PM -0700, William Lee Irwin III wrote:
On Tue, Apr 17, 2007 at 03:32:56PM -0700, William Lee Irwin III wrote:
I'm working with the following suggestion:

On Tue, Apr 17, 2007 at 09:07:49AM -0400, James Bruce wrote:
Nonlinear is a must IMO. I would suggest X = exp(ln(10)/10) ~= 1.2589
That value has the property that a nice=10 task gets 1/10th the cpu of a
nice=0 task, and a nice=20 task gets 1/100 of nice=0. I think that
would be fairly easy to explain to admins and users so that they can
know what to expect from nicing tasks.
I'm not likely to write the testcase until this upcoming weekend, though.

So that means there's a 10000:1 ratio between nice 20 and nice -19. In
that sort of dynamic range, you're likely to have non-trivial
numerical accuracy issues in integer/fixed-point math.

Well, you *are* specifying vastly different priorities. The question is how many nice=20 tasks should it take to interfere with a nice=-19 task? If you've only got a 100:1 ratio, 100 nice=20 tasks will take ~50% of the CPU away from a nice=-19 task. I don't think that's ideal, as in my mind a -19 task shouldn't have to care how many nice=20 tasks there are (within reason). IMHO, if a user is running a CPU hog at nice=-19, and expecting nice=20 tasks to run immediately, I don't think the scheduler is the problem.
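To make the arithmetic above concrete, here is a rough sketch (not scheduler code). It assumes every task is CPU-bound and that CPU time is divided in proportion to a per-task weight; the names `weight` and `share_of_first` are hypothetical helpers for illustration only.

```python
# Sketch only: weights are relative, with nice=0 defined as weight 1.

def weight(nice, base):
    """Relative CPU weight: each nice level divides the weight by `base`."""
    return base ** (-nice)

def share_of_first(first_nice, other_nices, base):
    """CPU fraction the first task gets if every task is CPU-bound."""
    w = weight(first_nice, base)
    return w / (w + sum(weight(n, base) for n in other_nices))

hogs = [20] * 100                # 100 background tasks at nice=20

base_100 = 100 ** (1 / 39)       # 100:1 spread over the 39 levels from -19 to 20
base_exp = 10 ** (1 / 10)        # ~1.2589, i.e. 10x per 10 nice levels

print(share_of_first(-19, hogs, base_100))   # about 0.5
print(share_of_first(-19, hogs, base_exp))   # roughly 0.988
```

Under these assumptions the 100:1 scheme leaves the nice=-19 task with about half the CPU, while the 1.2589 base leaves it with nearly all of it.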

(Especially if your clock is jiffies-scale, which a significant number
of machines will continue to be.)

I really think if we want to have vastly different ratios, we probably
want to be looking at BATCH and RT scheduling classes instead.

I, like all users, can live with anything, but there should be a clear specification of what the user should expect. Magic changes in the function at nice=0, or no real clear meaning at all (mainline), are both things that don't help the users to figure that out. I like the exponential base because shifting all tasks up or down one nice level does not change the relative cpu distribution (i.e. two tasks {nice=-5,nice=0} get the same relative cpu distribution as if they were {nice=0,nice=5}). An exponential base is the only way that property can hold.
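A small sketch of that shift-invariance property, again assuming CPU time is split in proportion to weight. The linear mapping `linear_weight` is a made-up contrast case, not what mainline or any posted patch does.

```python
def shares(nices, weight_fn):
    """CPU fractions for CPU-bound tasks, proportional to their weights."""
    ws = [weight_fn(n) for n in nices]
    total = sum(ws)
    return [w / total for w in ws]

def exp_weight(nice):
    # Exponential: 10x less CPU per 10 nice levels (base ~1.2589).
    return 10 ** (-nice / 10)

def linear_weight(nice):
    # Hypothetical linear mapping, for contrast only.
    return 20 - nice

# Shifting both tasks by +5 leaves the exponential split unchanged...
print(shares([-5, 0], exp_weight), shares([0, 5], exp_weight))
# ...but changes the linear one.
print(shares([-5, 0], linear_weight), shares([0, 5], linear_weight))
```

With the exponential weights, the shift multiplies every weight by the same constant, which cancels out of the ratio; with the linear mapping it does not.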

Now, perhaps implementation issues may prevent something like the "1.2589" ratio rule from being realized, but I'm not sure we should throw it out _before_ we know that it's actually a problem. This is the same sort of resistance that the timekeeping code updates faced (using nanoseconds everywhere instead of "natural" clock bases), but that got addressed eventually.

- Jim Bruce
