Re: [PATCH 0/2] perf/x86: Add ability to sample TSC

From: Adrian Hunter
Date: Thu Feb 19 2015 - 10:56:43 EST

On 19/02/2015 5:05 p.m., Peter Zijlstra wrote:
On Thu, Feb 19, 2015 at 04:38:57PM +0200, Adrian Hunter wrote:
On 19/02/15 15:50, Peter Zijlstra wrote:
On Thu, Feb 19, 2015 at 02:11:08PM +0200, Adrian Hunter wrote:

With the advent of switching perf_clock to CLOCK_MONOTONIC,
it will not be possible to convert perf_clock directly to/from
TSC. So add the ability to sample TSC instead.

Well, you can, mostly. MONOTONIC is only affected by NTP slew rate
changes, not offset changes.

The man page says it is also subject to adjtime(3).

which is slew adjustment; read the adjtime manpage :-)

And NTP limits the slew rate to 500 PPM, so even if you would get a

Assuming it is not broken.

NTP people are a cautious crowd, sure they get it wrong just like the
rest of us, but mostly it needs to work.

slew change and then not update the userpage data for a second you'd be
maximally off by 0.0005 seconds.

That could still be enough to break the decoder. It will certainly
misrepresent the order of events, which is a big loss of information.

What decoder? perf report is already subject to much larger shifts in
time if you run it on say a core2 machine.

Any decoder of Intel PT data. Side-band events like sched_switch or mmap
have to be sync'ed with Intel PT TSC timestamps to decode the trace. But
synchronizing any kind of event could be useful for analysis.

And that is way below what the current perf clock guarantees on funny

If you're really worried about this; we could maybe get John and Thomas
to allow us a callback on every slew change so we can update the
userpage data ASAP, much reducing the max error.

Say it takes 1e5 cycles to update your userpage, then you're never
further off than 50 cycles, which is below your ART multiplier.
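
For reference, both error figures quoted above follow from the same
product: slew rate times how long the conversion data stays stale. A
minimal sketch of that arithmetic, reading the update cost as 10^5 cycles
and assuming the 500 PPM limit mentioned earlier; max_slew_error() is a
hypothetical helper, not a kernel or perf function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case error accumulated while the conversion parameters are stale.
 * The result is in the same unit as stale_for (ns, cycles, ...).
 * slew_ppm is the slew rate in parts per million.
 * Note: stale_for * slew_ppm is assumed not to overflow 64 bits.
 */
static uint64_t max_slew_error(uint64_t stale_for, uint64_t slew_ppm)
{
	assert(slew_ppm <= 1000000);
	return stale_for * slew_ppm / 1000000;
}
```

With stale_for = 1000000000 ns (one second) and 500 PPM this gives
500000 ns, i.e. the 0.0005 seconds above; with stale_for = 100000 cycles
it gives the 50 cycles.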

You still need to wake up user space to read the userpage.

Uhm what? Userspace is already awake.

For Intel PT recording, perf record will be sleeping on poll().

Does that really matter? Also, if you have a stable crystal, the slew
rate change should be minimal and infrequent, never getting you close to
these numbers.

So no, I'm not convinced we need this.

Adding TSC to the sample is a lot simpler and more accurate.

Finding multiple samples and interpolating between them is much simpler
than reading tsc and doing the mult, shift and offset addition?
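
For reference, the "mult, shift and offset addition" can be sketched as
below. This is only an illustration modeled on the time_zero, time_mult
and time_shift fields of the perf userpage (struct perf_event_mmap_page);
struct tsc_conv and tsc_to_perf_time() are hypothetical names, and a real
reader would also have to check the capability bit and use the page's
seqlock to get a consistent snapshot of the fields:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the userpage conversion parameters. */
struct tsc_conv {
	uint64_t time_zero;   /* offset added after scaling, in ns */
	uint32_t time_mult;   /* multiplier */
	uint16_t time_shift;  /* right shift applied after multiplying */
};

/*
 * Convert a raw TSC value to perf time:
 *   time = time_zero + (tsc * time_mult) >> time_shift
 * The multiplication is split into quotient and remainder parts so the
 * intermediate product does not overflow 64 bits for large tsc values.
 */
static uint64_t tsc_to_perf_time(uint64_t tsc, const struct tsc_conv *c)
{
	uint64_t quot, rem;

	assert(c->time_shift < 64);
	quot = tsc >> c->time_shift;
	rem = tsc & (((uint64_t)1 << c->time_shift) - 1);

	return c->time_zero + quot * c->time_mult +
	       ((rem * c->time_mult) >> c->time_shift);
}
```

The result is only as good as the parameters: if the clock is slewed
after the snapshot is taken, the converted value drifts until the fields
are read again.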

I suspect you're talking about something else entirely; your changelogs
are inadequate for they tell nothing of your usecase and have me
guessing. Don't do that.

Sorry. I did mention Intel PT in patch 2, but I basically assumed the
need to synchronize events with other time sources was understood.