Re: [RFC] Exposing TSC "reliability" to userland

From: Venkatesh Pallipadi
Date: Tue May 04 2010 - 19:16:53 EST


On Mon, May 3, 2010 at 1:21 PM, Dan Magenheimer
<dan.magenheimer@xxxxxxxxxx> wrote:
>
> In a patch posted late last year by Venki:
>
> http://lkml.org/lkml/2009/12/17/360
>
> it was noted that some systems that specify the "Invariant TSC"
> bit in CPUID (on recent processors) are sadly not guaranteed to
> have synchronized TSCs.  As a result, Ingo's check_tsc_warp() is
> executed; if the warp test passes, the kernel uses TSC
> as clocksource and, if it doesn't pass, the kernel marks
> the TSC as unstable and chooses a different clocksource.
>
> Whether the kernel deems TSC to be reliable or not is a very
> useful piece of information to userland, e.g. to certain
> enterprise apps such as the Oracle DB, some JVMs, etc.  If
> TSC IS reliable, rdtsc can be used by many of these
> enterprise applications in many situations in place of a
> gettimeofday call.  Rdtsc can be much faster even than
> a vsyscall and it is certainly much much faster when,
> for one reason or another, vsyscall is not enabled.
> This can make a huge performance difference in real
> benchmarks when timestamps are frequently taken (10%
> benchmark performance improvement was measured using
> rdtsc vs gettimeofday syscall).
>
> Running a warp test in userland is not nearly as accurate
> as the warp test run by the kernel.  So it makes sense to expose
> the results of the kernel warp test to userland, maybe
> through /sys/devices/system/clocksource/tsc_reliable
>
> Comments?

[ Sorry if this is a duplicate. I had messed up my mail client format setting ]

One option is to remove tsc from
/sys/devices/system/clocksource/clocksource*/available_clocksource
when it is detected as unstable.

That should already be happening with NOHZ or HIGHRES selected, but it
should be simple to add some code to do this always.

Would that work?
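
For illustration, a userland check against that file might look roughly
like the sketch below. It assumes the kernel really does drop tsc from the
list once it is marked unstable (the behaviour proposed above), and it
assumes clocksource0 as the concrete instance behind "clocksource*". The
helper names clocksource_listed() and tsc_listed_in() are made up for this
example; they are not existing kernel or libc interfaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed concrete path; the mail above only says "clocksource*". */
#define AVAILABLE_CLOCKSOURCE \
    "/sys/devices/system/clocksource/clocksource0/available_clocksource"

/*
 * Hypothetical helper: return 1 if `name` appears as a whole
 * whitespace-separated token in `list`, 0 otherwise.
 */
static int clocksource_listed(const char *list, const char *name)
{
    size_t n;

    assert(list != NULL && name != NULL);
    n = strlen(name);
    if (n == 0)
        return 0;

    while (*list) {
        size_t len;

        list += strspn(list, " \t\n");   /* skip separators */
        len = strcspn(list, " \t\n");    /* length of next token */
        if (len == n && strncmp(list, name, n) == 0)
            return 1;
        list += len;
    }
    return 0;
}

/*
 * Hypothetical helper: return 1 if tsc is listed in the file at `path`,
 * 0 if it is not, and -1 if the file cannot be opened or read.
 */
static int tsc_listed_in(const char *path)
{
    char buf[256];
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = fgets(buf, sizeof(buf), f) != NULL;
    fclose(f);
    if (!ok)
        return -1;
    return clocksource_listed(buf, "tsc");
}
```

An application would call tsc_listed_in(AVAILABLE_CLOCKSOURCE) once at
startup and fall back to gettimeofday unless it returns 1. The whole-token
comparison is deliberate, so that a longer name which merely starts with
"tsc" is not mistaken for tsc itself.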

Thanks,
Venki