RE: [PATCH] x86: Export tsc related information in sysfs

From: Dan Magenheimer
Date: Sun May 16 2010 - 21:33:53 EST

Next message: KOSAKI Motohiro: "Re: [PATCH 1/9] mm: add generic adaptive large memory allocation APIs"
Previous message: Axel Lin: "[PATCH] leds-lp3944: properly handle lp3944_configure fail in lp3944_probe"
In reply to: Thomas Gleixner: "RE: [PATCH] x86: Export tsc related information in sysfs"
Next in thread: Arjan van de Ven: "Re: [PATCH] x86: Export tsc related information in sysfs"

> > > From: Thomas Gleixner [mailto:tglx@xxxxxxxxxxxxx]
> > > What we can talk about is a vget_tsc_raw() interface along with a
>
> What I have in mind and what I'm working on for quite a while is going
> to work on both bare metal and VMs w/o hidden slowness.

Well, if this can be done today/soon and is fast enough
(say <2x the cycles of a rdtsc), I am very interested and
"won't let my friends use rdtsc" :-) Anything I can do
to help?

> From: Thomas Gleixner [mailto:tglx@xxxxxxxxxxxxx]
> We try to use it for performance sake, but the kernel does at least
> its very best to find out when it goes bad. We then switch back to a
> hpet or pm-timer which is horrible performance wise but does not screw
> up timekeeping and everything which relies on it completely.
> :
> As I said, we try our very best to determine when things go awry, but
> there are small errors which occur either sporadically or after longer
> uptime which we cannot yet detect reliably. Multi-socket falls into
> that category, but we are working on that.

> From: Arjan van de Ven [mailto:arjan@xxxxxxxxxxxxx]
> Why do you think we do extensive and continuous validation of the tsc
> (and soon, continuous recalibration)

So the kernel has the ability to detect that the TSC
is "OK for now", but must use some kind of polling
(periodic warp test) to recognize that TSC has
gone "bad". As long as TSC is good AND a sophisticated
enterprise app understands that TSC might go bad at
some point in the future AND if the kernel exposes
"goodness" information AND the app (like the kernel) is
resilient** to the possibility that there might be some
period of time that obtained timestamps might be
"bad" before the app polls the kernel to find out that
the kernel says they are indeed "bad"... why should it
be forbidden for an app to use TSC?

(** e.g. increments its own tsc_last to ensure time never goes
backwards)
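
As a rough illustration of the footnote above, here is a minimal sketch in C
of an app-side clamp that keeps returned timestamps from going backwards.
The struct and function names (ts_state, ts_monotonic) are made up for this
example, and the raw value is passed in by the caller (for instance, the
result of a rdtsc read) so the sketch does not depend on any particular way
of reading the TSC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state: the last timestamp handed to the caller. */
struct ts_state {
    uint64_t tsc_last;
};

/*
 * Return a timestamp that never goes backwards.
 *
 * `raw` is whatever the app just read (for example from rdtsc). If it is
 * ahead of the previous value it is used as-is; otherwise tsc_last is
 * incremented by one so the caller still sees time move forward.
 */
static uint64_t ts_monotonic(struct ts_state *st, uint64_t raw)
{
    assert(st != NULL);
    if (raw > st->tsc_last)
        st->tsc_last = raw;
    else
        st->tsc_last += 1;
    return st->tsc_last;
}
```

This only hides backward jumps from the caller; it does not make a "bad"
TSC good, and timestamps issued during a backward jump are just consecutive
integers. State shared between threads would need atomics or a lock, which
this sketch leaves out.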

It seems like the only advantages the kernel has here over
a reasonably intelligent app are that: 1) the kernel can run
a warp test and the app can't, and 2) the kernel can
estimate the frequency of the TSC and the app can't.
AND, in the case of a virtual machine, the kernel has
neither of these advantages anyway.

So though I now understand and agree that neither the kernel
nor an app can guarantee that TSC won't unexpectedly go from
"good" to "bad", I still don't understand why "TSC goodness"
information shouldn't be exposed to userland, where an
intelligent enterprise app can choose to use TSC when it is good
(for the same reason the kernel does: "for performance sake")
and choose to stop using it when it goes bad (for the same
reason the kernel does: to "not screw up timekeeping").
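
To make the "use it when good, stop when bad" idea concrete, here is a small
sketch in C of the decision an app might make each time it polls. Everything
in it is hypothetical: the names (ts_source, ts_policy, ts_policy_poll) are
invented, and the kernel_says_good flag stands in for whatever "goodness"
information the kernel might export; no specific sysfs file or syscall is
assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which timestamp source the app is currently using (hypothetical names). */
enum ts_source {
    TS_SOURCE_TSC,       /* fast path: read the TSC directly */
    TS_SOURCE_FALLBACK   /* slow path: e.g. ask the kernel for the time */
};

struct ts_policy {
    enum ts_source source;
};

/*
 * Feed in the result of one poll of the (assumed) kernel "goodness" flag
 * and get back the source to use until the next poll.
 *
 * Assumption in this sketch: once the kernel reports the TSC as bad, the
 * app stays on the fallback and never switches back.
 */
static enum ts_source ts_policy_poll(struct ts_policy *p, bool kernel_says_good)
{
    assert(p != NULL);
    if (!kernel_says_good)
        p->source = TS_SOURCE_FALLBACK;
    return p->source;
}
```

Making the switch one-way is a choice of this sketch, not something the
thread prescribes. Timestamps taken between the TSC going bad and the next
poll may still be wrong, which is the window the app has to be resilient to.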

It sounds as if you are saying that "the kernel is allowed
to use a rope because if it accidentally gets the rope
around its neck, it has a knife to ensure it doesn't hang
itself" BUT "the app isn't allowed to use a rope because
it might hang itself and we'll be damned if we loan
our knife to an app because, well... because it doesn't
need a knife because we said it shouldn't use the rope".

I think you can understand why this isn't a very satisfying
explanation.

P.S. Thanks for taking the time to discuss this!
