Re: [PATCH 2/5] introduce CLOCK_MONOTONIC_RAW

From: john stultz
Date: Mon Mar 17 2008 - 22:04:20 EST



On Sat, 2008-03-15 at 05:50 +0100, Roman Zippel wrote:
> Hi,
>
> On Fri, 14 Mar 2008, john stultz wrote:
>
> > My solution is to introduce CLOCK_MONOTONIC_RAW. This exposes a
> > nanosecond based time value, that increments starting at bootup and has
> > no frequency adjustments made to it what so ever.
>
> A general comment: the raw clock doesn't need any adjustments, so updates
> don't have to be done that frequently and you can move most of that logic
> into second_overflow().

You're suggesting adding MONTONIC_RAW code to ntp.c ? That doesn't make
much sense to me (the whole point is its not ntp adjusted). Even if you
mean just inside the "if (clock->xtime_nsec >= (u64)NSEC_PER_SEC ..."
conditional, then that means we don't get to leverage the
cycles_interval, and cycle_last, values, so all of that would have to be
duplicated and managed. Which I'm not sure it would save us much more
then the extra add and compare here.


> > @@ -439,6 +475,7 @@ static void clocksource_adjust(s64 offset)
> > void update_wall_time(void)
> > {
> > cycle_t offset;
> > + static u64 raw_snsec; /* shifted raw nanosecnds */
> >
> > /* Make sure we're fully resumed: */
> > if (unlikely(timekeeping_suspended))
>
> IMO that's really a clock property, so this belongs in the clock
> structure.
> (Some day we may want to have multiple active clocks for various purposes
> and thus export multiple raw clocks.)

I disagree. I think that crufts up the clocksource structure (which is
ideally just a simple hw counter abstraction), with timekeeping state.

I'm still not sold on the multiple clocks with multiple notions of time
idea you keep on bringing up. But if/when we cross that bridge, maybe it
would be better to add a timekeeping_clock mid-layer abstraction that
keeps the clocksource specific timekeeping state. That way we don't add
lots of complexity for the clocksource driver writers to deal with and
we allow the clocksources to be better re-purposed (for maybe more sane
things like performance counters) without getting too bloated.

> > @@ -466,6 +504,11 @@ void update_wall_time(void)
> > second_overflow();
> > }
> >
> > + if (raw_snsec >= (u64)NSEC_PER_SEC << clock->shift) {
> > + raw_snsec -= (u64)NSEC_PER_SEC << clock->shift;
> > + monotonic_raw.tv_sec++;
> > + }
> > +
>
> You don't really have to use clock->shift as a scale, thus simplifying the
> shifting here (e.g. by using NTP_SCALE_SHIFT).
> I'm thinking about doing this for e.g. xtime_nsec as well.

Could you explain this more?

Given the clocksources have different shift values, why should we not
keep the extra resolution?

How does adding the extra shifting to convert from the clocksource scale
to the NTP_SCALE_SHIFT value simplify the shifting?

thanks
-john


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/