Re: [PATCH] sched: Support current clocksource handling in fallback sched_clock().

From: john stultz
Date: Tue May 26 2009 - 19:00:52 EST

On Tue, 2009-05-26 at 22:55 +0200, Peter Zijlstra wrote:
> On Tue, 2009-05-26 at 13:40 -0700, john stultz wrote:
> > On Tue, 2009-05-26 at 22:30 +0200, Peter Zijlstra wrote:
> > > On Tue, 2009-05-26 at 13:23 -0700, john stultz wrote:
> > > > Overall, I'd probably suggest thinking this through a bit more. At some
> > > > point doing this right will cause sched_clock() to be basically the same
> > > > as ktime_get(). So why not just use that instead of remaking it?
> > >
> > > simply because we don't require the strict global monotonicity for
> > > scheduling as we do from a regular time source (it's nice to have
> > > though).
> > >
> > > That means that on x86 we can always use TSC for sched_clock(), even
> > > when it's quite unsuitable for ktime.
> >
> > Right, but I guess what I'm asking is can this be a bit better defined?
> >
> > If we are going to use clocksources (or cyclecounters - an area I need
> > to clean up soon), it would be good to get an idea of what is expected
> > of the sched_clock() interface.
> >
> > So TSC good, HPET bad. Why?
> Because TSC is a few cycles to read, and you can factorize a largish
> prime while doing an HPET read :-)
> > Is latency all we care about? How bad would
> > the TSC have to be before we wouldn't want to use it?
> Anything better than jiffies ;-)

Except HPET though, right? :)

> For sched_clock() we want something high-res that is monotonic per cpu
> and has a bounded drift between cpus in the order of jiffies.
> Look at kernel/sched_clock.c for what we do to make really shitty TSC
> conform to the above requirements.
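
A minimal userspace sketch of the "monotonic per cpu" requirement quoted above. This is not the code in kernel/sched_clock.c; the struct, the function name, and the max_step_ns parameter are all hypothetical, and it makes no attempt to bound drift between cpus. It only shows the idea of filtering a misbehaving raw counter so the value handed out never goes backwards and never jumps forward by more than a chosen step:

```c
#include <stdint.h>

/* Hypothetical per-CPU filter state (illustration only). */
struct cpu_clock_state {
    uint64_t last_raw;  /* last raw reading seen on this CPU */
    uint64_t clock;     /* filtered value handed to callers */
};

/*
 * Feed one raw nanosecond reading through the filter.
 * A backwards raw step contributes no advance; a forward step
 * larger than max_step_ns is clamped to max_step_ns.
 */
static uint64_t filtered_clock(struct cpu_clock_state *s,
                               uint64_t raw_ns, uint64_t max_step_ns)
{
    uint64_t delta = 0;

    if (raw_ns > s->last_raw)
        delta = raw_ns - s->last_raw;
    if (delta > max_step_ns)
        delta = max_step_ns;

    s->last_raw = raw_ns;
    s->clock += delta;
    return s->clock;
}
```

With max_step_ns on the order of a jiffy, a single bad raw reading can push the filtered value forward by at most one jiffy.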

Sure. I guess what I'm trying to pull out here is this: if we create
some OK_FOR_SCHED_CLOCK flag for clocksources, and make this generic so
other arches can add that flag and be done, what guidance do we want to
give to arch maintainers for setting that flag?

1) Has to be very very fast. Can we put a number on this? 50ns to read?

2) How long does it have to be monotonic for? Is it ok if it wraps every
few seconds?
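
To put rough numbers on the wrap question, here is a small userspace sketch. Both helpers are hypothetical names made up for illustration, and the 32-bit / 1 GHz figures are an assumed example rather than any particular hardware:

```c
#include <stdint.h>

/*
 * Seconds until a free-running counter that is `bits` wide and
 * ticks at `hz` wraps around: 2^bits / hz.
 */
static double wrap_seconds(unsigned bits, double hz)
{
    double ticks = 1.0;

    for (unsigned i = 0; i < bits; i++)
        ticks *= 2.0;
    return ticks / hz;
}

/*
 * Wrap-safe delta between two raw reads of a `bits`-wide counter.
 * Only valid if less than one full wrap elapsed between the reads.
 */
static uint64_t cycles_delta(uint64_t now, uint64_t last, unsigned bits)
{
    uint64_t mask = (bits >= 64) ? UINT64_MAX
                                 : ((UINT64_C(1) << bits) - 1);

    return (now - last) & mask;
}
```

A 32-bit counter at 1 GHz gives wrap_seconds(32, 1e9) of about 4.3 seconds, so "wraps every few seconds" is only workable if the counter is guaranteed to be read more often than that; masked subtraction then keeps the deltas correct across the wrap.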

If get_cycles() || jiffies is what we want, then let's leave it there. I
just want to avoid mixing the clocksource code into the sched clock code
until we really get this sort of definition sorted.

