Re: [patch] CFS scheduler, -v8

From: Peter Williams
Date: Tue May 08 2007 - 20:02:26 EST


Esben Nielsen wrote:


On Tue, 8 May 2007, Peter Williams wrote:

Esben Nielsen wrote:


On Sun, 6 May 2007, Linus Torvalds wrote:

> > > On Sun, 6 May 2007, Ingo Molnar wrote:
> > > > * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > > > So the _only_ valid way to handle timers is to
> > > - either not allow wrapping at all (in which case "unsigned" is > > > better,
> > > since it is bigger)
> > > - or use wrapping explicitly, and use unsigned arithmetic (which is
> > > well-defined in C) and do something like "(long)(a-b) > 0".
> > > > hm, there is a corner-case in CFS where a fix like this is necessary.
> > > > CFS uses 64-bit values for almost everything, and the majority of > > values
> > are of 'relative' nature with no danger of overflow. (They are signed
> > because they are relative values that center around zero and can be
> > negative or positive.)
> > Well, I'd like to just worry about that for a while.
> > You say there is "no danger of overflow", and I mostly agree that once
> we're talking about 64-bit values, the overflow issue simply doesn't
> exist, and furthermore the difference between 63 and 64 bits is not > really
> relevant, so there's no major reason to actively avoid signed entries.
> > So in that sense, it all sounds perfectly sane. And I'm definitely not
> sure your "292 years after bootup" worry is really worth even > considering.
>

I would hate to tell mission control for Mankind's first mission to
another
star to reboot every 200 years because "there is no need to worry about
it."

As a matter of principle an OS should never need a reboot (with exception
for upgrading). If you say you have to reboot every 200 years, why not
every 100? Every 50? .... Every 45 days (you know what I am referring to
:-) ?

There's always going to be an upper limit on the representation of time.
At least until we figure out how to implement infinity properly.

Well you need infinite memory for that :-)
But that is my point: Why go into the problem of storing absolute time when you can use relative time?

I'd reverse that and say "Why go to the bother of using relative time when you can use absolute time?". Absolute time being time since boot, of course.





> When we're really so well off that we expect the hardware and software
> stack to be stable over a hundred years, I'd start to think about issues
> like that, in the meantime, to me worrying about those kinds of issues
> just means that you're worrying about the wrong things.
> > BUT.
> > There's a fundamental reason relative timestamps are difficult and > almost
> always have overflow issues: the "long long in the future" case as an
> approximation of "infinite timeout" is almost always relevant.
> > So rather than worry about the system staying up 292 years, I'd worry
> about whether people pass in big numbers (like some MAX_S64 > approximation)
> as an approximation for "infinite", and once you have things like that,
> the "64 bits never overflows" argument is totally bogus.
> > There's a damn good reason for using only *absolute* time. The whole
> "signed values of relative time" may _sound_ good, but it really sucks > in
> subtle and horrible ways!
>

I think you are wrong here. The only place you need absolute time is a for
the clock (CLOCK_REALTIME). You waste CPU using a 64 bit
representation when you could have used a 32 bit. With a 32 bit
implementation you are forced to handle the corner cases with wrap around
and too big arguments up front. With a 64 bit you hide those problems.

As does the other method. A 32 bit signed offset with a 32 bit base is just a crude version of 64 bit absolute time.

64 bit is also relative - just over a much longer period.

Yes, relative to boot.

32 bit signed offset is relative - and you know it. But with 64 people think it is absolute and put in large values as Linus said above.

What people? Who gets to feed times into the scheduler? Isn't it just using the time as determined by the system?

With 32 bit future developers will know it is relative and code for it. And they will get their corner cases tested, because the code soon will run into those corners.



I think CFS would be best off using a 32 bit timer counting in micro
seconds. That would wrap around in 72 minuttes. But as the timers are
relative you will never be able to specify a timer larger than 36 minuttes
in the future. But 36 minuttes is redicolously long for a scheduler and a
simple test limiting time values to that value would not break anything.

Except if you're measuring sleep times. I think that you'll find lots of tasks sleep for more than 72 minutes.

I don't think those large values will be relavant. You can easily cut off sleep times at 30 min or even 1 min.

The aim is to make the code as simple as possible not add this kind of rubbish and 1 minute would be far too low.

But you need to detect that the task have indeed been sleeping 2^32+1 usec and not 1 usec. You can't do with a 32 bit timer alone. In that case you need to use a (at least) 64 bit
timer - which is needed in the OS anyways. But not internally in the
wait queue, where the repeated calculations are.


Peter
--
Peter Williams pwil3058@xxxxxxxxxxxxxx

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/