Re: [patch] Performance Counters for Linux, v3

From: Ingo Molnar
Date: Thu Dec 11 2008 - 14:50:42 EST



* Tony Luck <tony.luck@xxxxxxxxx> wrote:

> > 	/*
> > 	 * Special "software" counters provided by the kernel, even if
> > 	 * the hardware does not support performance counters. These
> > 	 * counters measure various physical and sw events of the
> > 	 * kernel (and allow the profiling of them as well):
> > 	 */
> > 	PERF_COUNT_CPU_CLOCK		= -1,
> > 	PERF_COUNT_TASK_CLOCK		= -2,
> > 	/*
> > 	 * Future software events:
> > 	 */
> > 	/* PERF_COUNT_PAGE_FAULTS	= -3,
> > 	   PERF_COUNT_CONTEXT_SWITCHES	= -4, */
>
> ...
> > +[ Note: more hw_event_types are supported as well, but they are CPU
> > + specific and are enumerated via /sys on a per CPU basis. Raw hw event
> > + types can be passed in as negative numbers. For example, to count
> > + "External bus cycles while bus lock signal asserted" events on Intel
> > + Core CPUs, pass in a -0x4064 event type value. ]
>
> It looks like you have an overlap here. You are using some negative
> numbers to denote your special software events, but also as "raw"
> hardware events. What if these conflict?

that's an old comment, not a bug in the code - thx for pointing it out, i
just fixed the comments - see the commit below.

Raw events are now done without using up negative numbers, they are done
via:

struct perf_counter_hw_event {
	s64	type;

	u64	irq_period;
	u32	record_type;

	u32	disabled     :  1, /* off by default */
		nmi          :  1, /* NMI sampling */
		raw          :  1, /* raw event type */
		__reserved_1 : 29;

	u64	__reserved_2;
};

if the hw_event.raw bit is set to 1, then hw_event.type is treated as a
fully 'raw' event code. The default is for raw to be 0. So negative
numbers can be used for sw events, positive numbers for hw events. Both
can be extended gradually, without arbitrary limits being introduced.
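
To illustrate the split described above, here is a minimal userspace
sketch, not the kernel code itself. It mirrors the struct with <stdint.h>
types in place of s64/u64/u32. The enum event_class and the helper
classify_event() are hypothetical names made up for this example, and
treating type 0 as a hw event is an assumption.

```c
#include <stdint.h>

/*
 * Userspace stand-in for the struct quoted above; the kernel's
 * s64/u64/u32 are replaced by <stdint.h> types for illustration.
 */
struct perf_counter_hw_event {
	int64_t		type;

	uint64_t	irq_period;
	uint32_t	record_type;

	uint32_t	disabled     :  1, /* off by default */
			nmi          :  1, /* NMI sampling */
			raw          :  1, /* raw event type */
			__reserved_1 : 29;

	uint64_t	__reserved_2;
};

/* Hypothetical classification, not part of the patch. */
enum event_class {
	EVENT_RAW,	/* raw bit set: type is a CPU-specific event code */
	EVENT_SW,	/* raw bit clear, negative type: software counter */
	EVENT_HW,	/* raw bit clear, non-negative type: generic hw event */
};

/*
 * Check the raw bit first, so that the sign of 'type' only matters
 * for non-raw events. Assumption: type 0 counts as a hw event.
 */
static enum event_class
classify_event(const struct perf_counter_hw_event *ev)
{
	if (ev->raw)
		return EVENT_RAW;

	return ev->type < 0 ? EVENT_SW : EVENT_HW;
}
```

With this layout, the bus-lock event from the quoted note would be
requested with type = 0x4064 and raw = 1 rather than with a negative
type value, so it can no longer collide with the sw event numbers.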

Ingo

------------------------->