Re: [patch 2/7] clocksource: Get rid of the hardcoded 5 secondssleep time limit

From: John Stultz
Date: Wed May 18 2011 - 20:57:43 EST


On Wed, 2011-05-18 at 21:33 +0000, Thomas Gleixner wrote:
> plain text document attachment
> (clocksource-make-shift-mult-calc-more-clever.patch)
> Slow clocksources can have a way longer sleep time than 5 seconds and
> even fast ones can easily cope with 600 seconds and still maintain
> proper accuracy.
>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
> kernel/time/clocksource.c | 38 +++++++++++++++++++-------------------
> 1 file changed, 19 insertions(+), 19 deletions(-)
>
> Index: linux-2.6-tip/kernel/time/clocksource.c
> ===================================================================
> --- linux-2.6-tip.orig/kernel/time/clocksource.c
> +++ linux-2.6-tip/kernel/time/clocksource.c
> @@ -626,19 +626,6 @@ static void clocksource_enqueue(struct c
> list_add(&cs->list, entry);
> }
>
> -
> -/*
> - * Maximum time we expect to go between ticks. This includes idle
> - * tickless time. It provides the trade off between selecting a
> - * mult/shift pair that is very precise but can only handle a short
> - * period of time, vs. a mult/shift pair that can handle long periods
> - * of time but isn't as precise.
> - *
> - * This is a subsystem constant, and actual hardware limitations
> - * may override it (ie: clocksources that wrap every 3 seconds).
> - */
> -#define MAX_UPDATE_LENGTH 5 /* Seconds */
> -
> /**
> * __clocksource_updatefreq_scale - Used update clocksource with new freq
> * @t: clocksource to be registered
> @@ -652,15 +639,28 @@ static void clocksource_enqueue(struct c
> */
> void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
> {
> + unsigned long sec;
> +
> /*
> - * Ideally we want to use some of the limits used in
> - * clocksource_max_deferment, to provide a more informed
> - * MAX_UPDATE_LENGTH. But for now this just gets the
> - * register interface working properly.
> + * Calc the maximum number of seconds which we can run before
> + * wrapping around. For clocksources which have a mask > 32bit
> + * we need to limit the max sleep time to have a good
> + * conversion precision. 10 minutes is still a reasonable
> + * amount. That results in a shift value of 24 for a
> + * clocksource with mask >= 40bit and f >= 4GHz. That maps to
> + * ~ 0.06ppm granularity for NTP. We apply the same 12.5%
> + * margin as we do in clocksource_max_deferment()
> */

So, its not clear to me that the 12.5% margin really needed, since we
take it out when we calculate clocksource_max_deferment(). Although with
or without the margin I get the same mult/shift/max_idle_ns values for
the hardware I'm testing.

Another nice detail from the change: It doesn't affect clocksources that
normally wrap quickly:

Without your patch:
--------------
JDB: hpet mult: 2684354560 shift: 26 max_idle: 83214991360
JDB: acpi_pm mult: 2343484437 shift: 23 max_idle: 4540500826
JDB: tsc mult: 1342183991 shift: 31 max_idle: 2600481483

With your patch:
---------------
JDB: hpet mult: 2684354560 shift: 26 max_idle: 83214991360
JDB: acpi_pm mult: 2343484437 shift: 23 max_idle: 4540500826
JDB: tsc mult: 10485812 shift: 24 max_idle: 332861616128

Although interestingly 332 secs calculated for the max idle is more then
12% off of 600.

> + sec = (cs->mask - (cs->mask >> 5));
> + do_div(sec, freq);
> + do_div(sec, scale);
> + if (!sec)
> + sec = 1;
> + else if (sec > 600 && cs->mask > UINT_MAX)
> + sec = 600;
> +
> clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
> - NSEC_PER_SEC/scale,
> - MAX_UPDATE_LENGTH*scale);
> + NSEC_PER_SEC / scale, sec * scale);
> cs->max_idle_ns = clocksource_max_deferment(cs);
> }
> EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);

Overall it looks good. I'm doing some NTP testing with the TSC shift
change to make sure we don't get any odd side effects. I'll let you know
how those go.

thanks
-john



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/