performance monitor interface?

David Mosberger (davidm@AZStarNet.com)
Tue, 22 Aug 1995 10:34:27 -0700


Neither Linux nor any of the commercial Unixes on the market provides
even halfway decent support for performance measurements. And this
despite the fact that many recent processors have fairly sophisticated
performance monitoring support. For example, both the Alpha and
Pentium chips provide performance counters that allow counting events
such as cache-misses, CPU stalls, or branch-mispredictions. In
contrast, most OSes that I know of do not even provide the means to
flush caches reliably. In some, but not all, cases this can be achieved
at user level. For example, there presently seems to be no
(clean/decent) way to flush the 2nd-level cache on either Linux or DEC
Unix running on an Alpha.
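
To illustrate the user-level case: a rough sketch of the usual
workaround, which is to touch a buffer larger than the cache so that
earlier contents get displaced. The function name and both parameters
are made up for illustration, and the cache size and line size are
assumptions supplied by the caller, not values queried from the
hardware. On a physically indexed cache the virtual-to-physical page
mapping decides which sets get touched, so this is best effort only,
which is exactly why a proper OS facility would be preferable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Best-effort eviction of a data cache from user level.
 * cache_bytes and line_bytes are ASSUMED values given by the caller.
 * Returns the number of cache lines touched, or 0 if malloc fails.
 */
size_t evict_data_cache(size_t cache_bytes, size_t line_bytes)
{
    assert(cache_bytes > 0 && line_bytes > 0);

    /* Twice the cache size leaves some headroom for replacement
       policies that are not strict LRU (a heuristic, no guarantee). */
    size_t buf_bytes = 2 * cache_bytes;
    volatile unsigned char *buf = malloc(buf_bytes);
    if (buf == NULL)
        return 0;

    size_t touched = 0;
    for (size_t off = 0; off < buf_bytes; off += line_bytes) {
        buf[off] = (unsigned char)off;  /* one store per cache line */
        touched++;
    }

    free((void *)buf);
    return touched;
}
```

A caller would invoke this with its best guess at the geometry, e.g.
evict_data_cache(2 * 1024 * 1024, 32), before each timed run.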

Thus, I wonder whether anybody has already considered designing and
implementing a reasonably portable interface that would provide:

o memory-system information (structure of memory
system, performance parameters)

o access to performance monitor hardware (where present); e.g.,
explicit measurement/counting of events and/or performance
counter based profiling (e.g., to obtain a histogram of
cache-misses)

o control over memory system state (e.g., flushing of individual
caches, TLB, etc.)
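
As a sketch of the counter-based profiling in the second item: one
common approach is to have the counter raise an interrupt every N
events and to record the interrupted program counter in a histogram,
much as profil(3) does for timer ticks. The structure and function
below are hypothetical names chosen for illustration; only the
bucketing step is shown, not the interrupt setup, which is hardware
specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical histogram over a range of code addresses. */
struct miss_histogram {
    uintptr_t      base;         /* lowest address covered */
    size_t         bucket_size;  /* bytes of code per bucket */
    size_t         nbuckets;     /* number of entries in counts[] */
    unsigned long *counts;       /* caller-provided array */
    unsigned long  dropped;      /* samples outside the range */
};

/* Record one sampled program counter, e.g. from an overflow handler. */
void record_sample(struct miss_histogram *h, uintptr_t pc)
{
    assert(h != NULL && h->bucket_size > 0);

    if (pc < h->base) {
        h->dropped++;
        return;
    }
    size_t idx = (size_t)((pc - h->base) / h->bucket_size);
    if (idx >= h->nbuckets) {
        h->dropped++;
        return;
    }
    h->counts[idx]++;
}
```

With a bucket size of one instruction this gives per-instruction
counts; larger buckets trade resolution for memory.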

Obviously, some of these operations are security sensitive but it is
quite acceptable if these facilities are available to the superuser
only. For controlled experimentation, it will in many cases be
necessary to run the system in single-user mode anyway.

Also, notice that oftentimes much of such an interface could be
implemented at user level. What I envision is a minimal and
possibly architecture-dependent interface with library support to
provide the full and (mostly) architecture-independent interface.
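
A rough sketch of how that layering might look, under the assumption
that each CPU type supplies a small table of operations and the
library builds everything else on top. All names here (pm_arch_ops,
pm_measure, pm_event, the fake backend) are invented for illustration
and do not correspond to any existing interface. The fake backend
stands in for real counter hardware so that the example is
self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event names exposed by the portable layer. */
enum pm_event { PM_EV_CACHE_MISS, PM_EV_BRANCH_MISPREDICT, PM_EV_COUNT };

/* Minimal architecture-dependent layer: one table per CPU type. */
struct pm_arch_ops {
    const char *name;
    int (*start)(enum pm_event ev);  /* 0 on success, -1 if unsupported */
    int (*read)(enum pm_event ev, unsigned long *value);
};

/* Portable library call: count events of type ev while fn(arg) runs. */
int pm_measure(const struct pm_arch_ops *ops, enum pm_event ev,
               void (*fn)(void *), void *arg, unsigned long *delta)
{
    unsigned long before, after;

    assert(ops != NULL && fn != NULL && delta != NULL);
    if (ops->start(ev) != 0)
        return -1;
    if (ops->read(ev, &before) != 0)
        return -1;
    fn(arg);
    if (ops->read(ev, &after) != 0)
        return -1;
    *delta = after - before;
    return 0;
}

/* Fake backend: a software counter instead of real hardware. */
static unsigned long fake_counter;

static int fake_start(enum pm_event ev)
{
    return ev == PM_EV_CACHE_MISS ? 0 : -1;  /* supports one event only */
}

static int fake_read(enum pm_event ev, unsigned long *value)
{
    (void)ev;
    *value = fake_counter;
    return 0;
}

const struct pm_arch_ops pm_fake_ops = { "fake", fake_start, fake_read };

/* Workload for the fake backend: "causes" *arg events. */
void fake_workload(void *arg)
{
    fake_counter += *(unsigned long *)arg;
}
```

A real backend would fill in the same table with code that programs
the counters of a particular CPU; callers of pm_measure() would not
need to change.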

If anybody is interested in seriously pursuing this idea, here is some
more info:

o the latest GNU gprof (available in the form of a snapshot)
has support for fine-grained (line-by-line) and non-real-time
based profiling

o free code for accessing the EV4 (aka 21064 Alpha CPU) performance
counter is available; if interested, drop me a note

If this is done properly, I think Linux could have a real edge over
commercial OSes for the research community and anybody else interested
in understanding system performance.

Any feedback would be appreciated.

--david