Re: [RFC PATCH 1/3] Unified trace buffer
From: Mathieu Desnoyers
Date: Thu Sep 25 2008 - 16:29:45 EST
* Ingo Molnar (mingo@xxxxxxx) wrote:
>
> * Ingo Molnar <mingo@xxxxxxx> wrote:
>
> > firstly, for the sake of full disclosure, the very first versions of
> > the latency tracer (which, through hundreds of revisions, morphed into
> > ftrace), used raw TSC timestamps.
> >
> > I stuck to that simple design for a _long_ time because i shared your
> > exact views about robustness and simplicity. But it was pure utter
> > nightmare to get the timings right after the fact, and i got a _lot_
> > of complaints about the quality of timings, and i could never _trust_
> > the timings myself for certain types of analysis.
> >
> > So i eventually went to the scheduler clock and never looked back.
> >
> > So i've been there, i've done that. In fact i briefly tried to use the
> > _GTOD_ clock for tracing - that was utter nightmare as well, because
> > the scale and breadth of the GTOD code is staggering.
>
> heh, and i even have a link for a latency tracing patch from 2005 that is
> still alive that proves it:
>
> http://people.redhat.com/mingo/latency-tracing-patches/patches/latency-tracing.patch
>
> (dont look at the quality of that code too much)
>
> It has this line for timestamp generation:
>
> + timestamp = get_cycles();
>
> i.e. we used the raw TSC, we used RDTSC straight away, and we used that
> for _years_, literally.
>
> So i can tell you my direct experience with it: i had far more problems
> with the tracer due to inexact timings and traces that i could not
> depend on, than i had problems with sched_clock() locking up or
> crashing.
>
> Far more people complained about the accuracy of timings than about
> performance or about the ability (or inability) to stream gigs of
> tracing data to user-space.
>
> It was a very striking difference:
>
> - every second person who used the tracer observed that the timings
>   looked odd at places.
>
> - only every 6 months has someone asked whether he could save
>   gigabytes of trace data.
>
> For years i maintained a tracer with TSC timestamps, and for years i
> maintained another tracer that used sched_clock(). Exact timings are a
> feature most people are willing to spend extra cycles on.
>
> You seem to dismiss that angle by calling my arguments bullshit, but i
> dont know on what basis you dismiss it. Sure, a feature and extra
> complexity _always_ has a robustness cost. If your argument is that we
> should move cpu_clock() to assembly to make it more dependable - i'm all
> for it.
>
> Ingo
>
Hi Ingo,
I completely agree with both Linus and you that accuracy utterly
matters. I currently provide a time source meant to meet the tracing
requirements and to support architectures lacking a synchronized TSC (or
any TSC at all) in my lttng tree. Feel free to have a look. I've had
satisfied users relying on these time sources for about 3 years.
See the lttng-timestamp-* commits in
git://git.kernel.org/pub/scm/linux/kernel/git/compudj/linux-2.6-lttng.git
The x86 one, which is the one in question here, is linked below. You'll
see that everything fits in a small header and can thus be inlined in
the callers.
http://git.kernel.org/?p=linux/kernel/git/compudj/linux-2.6-lttng.git;a=blob;f=include/asm-x86/ltt.h;h=96ef292729a15d93af020ce5526669d220a1d795;hb=5fced7ecdac8ce65298ddbad191ce9fe998cfe9a
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68