Re: [RFC] perf: need to expose sched_clock to correlate user samples with kernel samples

From: John Stultz
Date: Wed Nov 14 2012 - 17:26:22 EST


On 11/13/2012 12:58 PM, Steven Rostedt wrote:
On Fri, 2012-11-09 at 18:04 -0800, John Stultz wrote:
On 10/16/2012 10:23 AM, Peter Zijlstra wrote:
I've no problem with adding CLOCK_PERF (or another/better name).

Hrm. I'm not excited about exporting that sort of internal kernel
details to userland.

The behavior and expectations of sched_clock() have changed over the
years, so I'm not sure it's wise to export it, since we'd have to
preserve its behavior from then on.

Also I worry that it will be abused in the same way that direct TSC
access is, where the seemingly better performance compared with the more
careful/correct CLOCK_MONOTONIC would cause developers to write fragile
userland code that will break when moved from one machine to the next.

I'd probably rather perf output timestamps to userland using sane clocks
(CLOCK_MONOTONIC), rather than trying to introduce a new time domain to
userland. But I probably could be convinced I'm wrong.

I'm surprised that perf has its own clock anyway. But I would like to
export the tracing clocks. We have three (well four) of them:

trace_clock_local() which is defined to be a very fast clock but may not
be synced with other cpus (basically, it just calls sched_clock).

trace_clock() which is not totally serialized, but also not totally off
(between local and global). This uses local_clock() which is the same
thing that perf_clock() uses.

trace_clock_global() which is a monotonic clock across CPUs. It's much
slower than the above, but works well when you require synced
timestamps.

There's also trace_clock_counter() which isn't even a clock :-) It's
just an incrementing atomic counter that goes up every time it's called.
This is the most synced clock, but is absolutely meaningless for
timestamps. It's just a way to show ordered events.
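
The counter "clock" described above can be illustrated with a small
userland sketch in C11. This is not the kernel's trace_clock_counter()
implementation; the names event_counter and counter_clock() are
hypothetical and only show why such a value orders events without saying
anything about elapsed time.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userland analogue of a counter "clock". It is shared by
 * all threads and carries no notion of time, only a total order over
 * the calls that were made. Static storage starts at zero. */
static atomic_uint_fast64_t event_counter;

/* Return a strictly increasing sequence number. atomic_fetch_add()
 * returns the previous value, so add 1 to make the first call return 1. */
static uint64_t counter_clock(void)
{
    return (uint64_t)atomic_fetch_add(&event_counter, 1) + 1;
}
```

Two events stamped this way can be ordered against each other, but the
difference between their stamps says nothing about how much time passed
between them.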

Oof. This is getting uglier. I'd really prefer not to expose all these different internal clocks out to userland. Especially via clock_gettime().

thanks
-john
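
As a rough sketch of the alternative raised earlier in the thread, a
userland program can already stamp its own samples with CLOCK_MONOTONIC
through the standard POSIX clock_gettime() call. The helper name
monotonic_ns() is hypothetical, and the sketch assumes a POSIX system;
whether perf reports kernel samples in the same time domain is exactly
the open question here, so this shows only the user side.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read CLOCK_MONOTONIC and return it as
 * nanoseconds. Returns 0 if clock_gettime() fails. */
static uint64_t monotonic_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;

    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```

A user-level profiler could record monotonic_ns() alongside each of its
own samples; correlating those with kernel samples would then only need
the kernel side to report timestamps from the same clock.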
