Re: [PATCH] perf: POSIX CLOCK_PERF to report current time value

From: Ingo Molnar
Date: Wed Dec 11 2013 - 07:08:05 EST



* John Stultz <john.stultz@xxxxxxxxxx> wrote:

> [...]
>
> I'd much rather see perf export CLOCK_MONOTONIC_RAW timestamps,
> since that clockid is well defined. [...]

So the problem with that clock is that it does the following for every
timestamp:

	cycle_now = clock->read(clock);

... which is impossibly slow if something like the HPET is used, which
is rather common - so this is a non-starter to timestamp perf events
with. We use the scheduler clock as a reasonable compromise between
scalability and clock globality.

I can see two solutions:

1)

One approach is what I described in my other reply a few minutes ago:
track the flow of GTOD, timestamped with the fast perf timestamps, so
that GTOD can be correlated to the perf clock, if user-space so
wishes. The correlation is simple so this gets close to the ease of
use of being able to timestamp GTOD directly.

(That would be useful for other purposes as well, such as
instrumenting GTOD updates.)
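
To illustrate how simple the user-space side of that correlation could
be, here is a rough sketch. Note that struct gtod_sample and
perf_to_gtod() are made-up names for illustration only, there is no
such record type in perf today, and I am assuming the GTOD update
events arrive sorted by their perf timestamp:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one GTOD update, stamped with the perf clock. */
struct gtod_sample {
	uint64_t perf_ns;	/* perf (scheduler clock) timestamp of the update */
	uint64_t gtod_ns;	/* time of day at that moment, in nanoseconds */
};

/*
 * Map a perf timestamp to time of day, using the latest GTOD sample
 * taken at or before it. The samples must be sorted by perf_ns.
 * A timestamp older than the first sample is extrapolated backwards
 * from that first sample (the unsigned wraparound cancels out).
 */
static uint64_t perf_to_gtod(const struct gtod_sample *s, size_t n,
			     uint64_t perf_ns)
{
	size_t lo = 0, hi = n;

	assert(n > 0);

	/* Binary search: lo ends up as the number of samples <= perf_ns. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (s[mid].perf_ns <= perf_ns)
			lo = mid + 1;
		else
			hi = mid;
	}

	size_t i = lo ? lo - 1 : 0;

	return s[i].gtod_ns + (perf_ns - s[i].perf_ns);
}
```

Tooling would collect the samples while parsing the event stream and
call perf_to_gtod() for every event it wants to print with a wall-clock
time.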

2)

An alternate, rather interesting approach would be to change the
scheduler clock offset to be influenced by the above events, so that
it quasi-approximates GTOD and emits natural time of day timestamps.

This already happens partially in the sched-clock slow path: in
kernel/sched/clock.c, sched_clock_local() uses the scd->tick_gtod
timestamps to correlate to the monotonic clock. This could be changed
over to use getnstimeofday() instead of ktime_get(), to get true TOD
timestamps.
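
For reference, the slow-path logic boils down to something like the
sketch below. This is a simplified user-space model written from
memory, not the actual kernel code: struct clock_data, clock_tick(),
clock_local() and the 1 msec TICK_NS value are my own stand-ins. The
point is that whatever reference clock gets stored at tick time is what
the resulting timestamps follow:

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NS 1000000ULL	/* assumed tick length: 1 msec */

struct clock_data {
	uint64_t tick_raw;	/* fast raw clock value at the last tick */
	uint64_t tick_gtod;	/* reference clock value at the last tick */
	uint64_t clock;		/* last timestamp handed out */
};

static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }
static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }

/* At every tick, record the raw clock and the reference clock together. */
static void clock_tick(struct clock_data *cd, uint64_t raw_now,
		       uint64_t ref_now)
{
	cd->tick_raw = raw_now;
	cd->tick_gtod = ref_now;
}

/*
 * Between ticks, extrapolate from the reference using the raw delta.
 * The result is clamped so that it never goes backwards and never runs
 * more than one tick ahead of the last reference value.
 */
static uint64_t clock_local(struct clock_data *cd, uint64_t raw_now)
{
	uint64_t clock = cd->tick_gtod + (raw_now - cd->tick_raw);
	uint64_t min_clock = max_u64(cd->tick_gtod, cd->clock);
	uint64_t max_clock = max_u64(cd->clock, cd->tick_gtod + TICK_NS);

	clock = max_u64(clock, min_clock);
	clock = min_u64(clock, max_clock);
	cd->clock = clock;

	return clock;
}
```

Feeding a time-of-day value into clock_tick() as ref_now, instead of a
monotonic one, is all it takes in this model for clock_local() to
return TOD-based timestamps.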

The trickier bit is the x86 fast-path, in arch/x86/kernel/tsc.c's
native_sched_clock(). That relies on __cycles_2_ns() to transform a
CPU cycles timestamp into (boot time offset) nanoseconds. For that it
uses the cyc2ns_offset percpu variable. That variable could be updated
periodically so that it holds a TOD offset.
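
As a rough sketch of the idea, loosely modelled on the scaled-multiply
conversion used in the fast path: the struct, the function names and
the shift value below are illustrative assumptions, not the kernel's
definitions, and unlike the real code this ignores overflow of the
multiplication for very large cycle counts:

```c
#include <assert.h>
#include <stdint.h>

#define CYC2NS_SHIFT 10	/* illustrative scale factor: 2^10 */

struct cyc2ns {
	uint32_t mul;		/* (10^6 << CYC2NS_SHIFT) / cpu_khz */
	uint64_t offset_ns;	/* added to every converted value */
};

/* Precompute the multiplier for a CPU running at cpu_khz. */
static uint32_t cyc2ns_mul(uint32_t cpu_khz)
{
	assert(cpu_khz > 0);

	return (uint32_t)((1000000ULL << CYC2NS_SHIFT) / cpu_khz);
}

/* Fast path: one multiply, one shift, one add. */
static uint64_t cycles_to_ns(const struct cyc2ns *c, uint64_t cycles)
{
	return c->offset_ns + ((cycles * c->mul) >> CYC2NS_SHIFT);
}

/*
 * Periodic update: re-base the offset so that the cycle count
 * cycles_now converts to the time of day tod_ns.
 */
static void cyc2ns_set_tod(struct cyc2ns *c, uint64_t cycles_now,
			   uint64_t tod_ns)
{
	c->offset_ns = tod_ns - ((cycles_now * c->mul) >> CYC2NS_SHIFT);
}
```

The fast path stays exactly as cheap as before, only the value of the
offset changes. How often the offset would have to be re-based to track
TOD adjustments is left open here.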

My (strong!) preference would be #2, for the simple reason that it
would make perf timestamps instantly usable and tooling wouldn't have
to do anything to get true timestamps. We could add a new
PERF_SAMPLE_TIME_OF_DAY feature bit so that user-space can consciously
request GTOD timestamps. This feature bit could even be arch
influenced, so that architectures could convert their perf clocks at
the pace they desire - which tooling can detect and handle safely.

Thoughts?

Thanks,

Ingo