Re: [PATCH 01/13] perf/core: Add perf_arch_regs and mask to perf_regs structure

From: Madhavan Srinivasan
Date: Tue Sep 06 2016 - 00:26:15 EST




On Thursday 01 September 2016 12:56 PM, Peter Zijlstra wrote:
On Mon, Aug 29, 2016 at 02:30:46AM +0530, Madhavan Srinivasan wrote:
It's a perennial request from hardware folks to be able to
see the raw values of the pmu registers. Partly it's so that
they can verify perf is doing what they want, and some
of it is that they're interested in some of the more obscure
info that isn't plumbed out through other perf interfaces.
How much and what is that? Can't we try and get interfaces sorted?

We have a bunch of registers which export information about the
sampled instruction, like SIER/SIAR/SDAR/MMCRA. A lot of bits in these
registers are not yet architected and, in the case of the SIER register,
some of the bits are not plumbed out; we are working on getting some of
these exposed via perf.


Over the years we have internally used various hacks to get
the requested data out, but this is an attempt to use a
somewhat standard mechanism (using PERF_SAMPLE_REGS_INTR).
Not really liking that. It assumes too much and doesn't seem to cover
about half the perf use-cases.

It assumes the machine state can be captured by registers (this is false
for things like Intel DS/PT, which have state in memory), it might
assume <= 64 registers but I didn't look that closely, this too might
become somewhat restrictive.

Worse, it doesn't work for !sampling workloads, of which you also very
much want to verify programming etc.
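
For reference, a minimal userspace sketch of the mechanism under
discussion, assuming the stock perf_event_attr layout from
<linux/perf_event.h>. The helper names (regs_in_mask,
setup_intr_regs_sampling) are hypothetical and not part of this
patchset. It illustrates the two limits raised above: the register
selection is a single 64-bit mask, and the registers only travel in
sample records, so an event with no sample period never emits them.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: how many registers a given mask selects. */
static int regs_in_mask(uint64_t mask)
{
    return __builtin_popcountll(mask);
}

/*
 * Hypothetical helper: fill in attr for a cycles event that samples
 * every sample_period events and dumps the registers selected by
 * intr_mask at interrupt time. The bit-to-register mapping is
 * architecture specific (see the arch's uapi perf_regs.h).
 */
static void setup_intr_regs_sampling(struct perf_event_attr *attr,
                                     uint64_t sample_period,
                                     uint64_t intr_mask)
{
    assert(attr != NULL);

    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_CPU_CYCLES;

    /* A non-zero period makes this a sampling event. */
    attr->sample_period = sample_period;

    /* Ask for the interrupt-time register dump in each sample. */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_REGS_INTR;

    /* One bit per register: at most 64 registers can be selected. */
    attr->sample_regs_intr = intr_mask;

    attr->disabled = 1;
}
```

The filled-in attr would then be handed to the perf_event_open()
syscall. That step is left out here because it needs suitable
permissions and an arch that implements interrupt-time register
sampling.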

Yes, I agree, my bad. I designed and implemented this with pmu
registers primarily in mind, but we can extend it with additional flags
describing the content being copied. Good point that the patchset does
not handle the !sampling case. Let me explore this and also the tracing
options.

Thanks for the comments.
Maddy


This would also be helpful for those of us working on the perf
hardware backends, to be able to verify that we're programming
things correctly, without resorting to debug printks etc.
On x86 we can trace the MSR writes. No need to add debug printk()s.
We could add (and I have on occasion added) tracepoints (well, trace_printk)
to the Intel DS memory stores to see what was written there.

Tracing is much more flexible for debugging this stuff.

Can't you do something along those lines?