HW perf. events arch implementation
From: Michael Cree
Date: Tue Feb 23 2010 - 21:33:38 EST
I am trying to implement arch specific code on the Alpha for hardware
performance events (yeah, I'm probably a little bit loopy and unsound
of mind pursuing this on an end-of-line platform, but it's a way in to
learn a little bit of kernel programming and it scratches an itch).
I have taken a look at the code in the x86, sparc and PPC
implementations and tried to drum up an Alpha implementation for the
EV67/7/79 CPUs, but it ain't working and is producing obviously
erroneous counts. Part of the problem is that I don't understand
under what conditions, and with what assumptions, the performance
event subsystem is calling into the architecture specific code. Is
there any documentation available that describes the architecture
specific interface?
The Alpha CPUs of interest have two 20-bit performance monitoring
counters that can count cycles, instructions, Bcache misses and Mbox
replays (but not all combinations of those). For round numbers,
consider a 1 GHz CPU with a theoretical maximum sustained throughput
of four instructions per cycle. A 20-bit counter wraps every 2^20
(about a million) events, so a single performance counter counting
instructions could generate roughly 4000 interrupts per second to
signal counter overflow.
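To make that estimate concrete, here is a minimal sketch of the arithmetic in C. The function name and its parameters are made up for illustration and are not part of any kernel interface; it only assumes a free-running counter that raises one interrupt each time it wraps.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case overflow interrupts per second for a free-running counter
 * that is `width` bits wide and is incremented `events_per_sec` times a
 * second. The counter wraps once every 2^width events.
 * (Hypothetical helper, for illustration only.)
 */
static uint64_t overflow_irqs_per_sec(uint64_t events_per_sec,
				      unsigned int width)
{
	assert(width > 0 && width < 64);
	return events_per_sec >> width;	/* integer division by 2^width */
}
```

With 4,000,000,000 instructions per second and a 20-bit counter this gives 3814, which is where the "about 4000" figure comes from. The same event rate on a 32-bit counter would wrap less than once per second.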
The x86, sparc and PPC implementations seem to me to assume that calls
to read back the counters occur more frequently than performance
counter overflow interrupts, and that the highest bit of the counter
can safely be used to detect overflow. (Am I correct?) That is
unlikely to hold on the Alpha because the counters are so narrow.
Is there someone who would be happy to give me, a kernel
newbie who probably doesn't even make the grade of neophyte, a bit of
direction on this?
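For reference, the kind of bookkeeping in question can be sketched as below. This is only an illustration under stated assumptions, not how any existing architecture does it: the struct, the function and the PMC_* macros are hypothetical names, and the counter is assumed to be a plain 20-bit up-counter that wraps to zero.

```c
#include <assert.h>
#include <stdint.h>

#define PMC_WIDTH 20u				/* assumed counter width */
#define PMC_MASK  ((UINT64_C(1) << PMC_WIDTH) - 1)

/* Hypothetical software state kept for each hardware counter. */
struct soft_counter {
	uint64_t total;		/* accumulated 64-bit event count */
	uint32_t prev_raw;	/* raw hardware value at the last update */
};

/*
 * Fold the current raw hardware value into the 64-bit total.
 * The masked subtraction handles a single wrap; the result is correct
 * only if the counter wrapped at most once since the previous call.
 */
static void soft_counter_update(struct soft_counter *c, uint32_t raw)
{
	uint64_t delta = ((uint64_t)raw - c->prev_raw) & PMC_MASK;

	c->total += delta;
	c->prev_raw = raw;
}
```

The "at most one wrap" condition is the catch on a narrow counter: it would have to be guaranteed by calling the update from the overflow interrupt handler as well as from ordinary reads, rather than by relying on reads being frequent enough.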
Also, the Alpha CPUs have an interesting mode whereby one programs
a counter with a specified (or random) value that selects a
future instruction to profile. The CPU runs for that number of
instructions/cycles, then a short monitoring window (of a few cycles)
is opened around the profiled instruction, and when the window closes
an interrupt is generated. One can then read back a whole lot of
information about the pipeline at the time of the profiled
instruction. This can be used for statistical sampling. Does the
performance events subsystem support monitoring with such a mode?
Cheers
Michael.