Re: [RFC][PATCH 3/9] perf: export registered pmus via sysfs

From: Peter Zijlstra
Date: Mon May 10 2010 - 07:42:34 EST


(Added Will Deacon)

On Mon, 2010-05-10 at 13:27 +0200, Peter Zijlstra wrote:
> On Mon, 2010-05-10 at 18:26 +0800, Lin Ming wrote:
>
> > > No, I'm assuming there is only 1 PMU per CPU. Corey is the expert on
> > > crazy hardware though, but I think the sanest way is to extend the CPU
> > > topology if there's more structure to it.
> >
> > But our goal is to support multiple pmus, don't we need to assume there
> > are more than 1 PMU per CPU?
>
> No, because as I said, then it's ambiguous which pmu you want. If you have
> that, you need to extend your topology information.
>
> Anyway, I talked with Ingo on this and he'd like to see this somewhat
> extended.
>
> Instead of a pmu_id field, which we pass into a new
> perf_event_attr::pmu_id field, how about creating an event_source sysfs
> class. Then each class can have an event_source_id and a hierarchy of
> 'generic' events.
>
> We'd start using the PERF_TYPE_ space for this and express the
> PERF_COUNT_ space in the event attributes found inside that class.
>
> That way we can include all the existing event enumerations into this as
> well.
>
> This way we can create:
>
> /sys/devices/system/cpu/cpuN/cpu_hardware_events
>                              cpu_hardware_events/event_source_id
>                              cpu_hardware_events/cpu_cycles
>                              cpu_hardware_events/instructions
>                                                 /...
>
> /sys/devices/system/cpu/cpuN/cpu_raw_events
>                              cpu_raw_events/event_source_id
>
>
> These would match the current PERF_TYPE_* values for compatibility
>
> For new PMUs we can start a dynamic range of PERF_TYPE_ (say at 64k but
> that's not ABI and can be changed at any time, we've got u32 to play
> with).
>
> For uncore this would result in:
>
> /sys/devices/system/node/nodeN/node_raw_events
>                                node_raw_events/event_source_id
>
> and maybe:
>
> /sys/devices/system/node/nodeN/node_events
>                                node_events/event_source_id
>                                node_events/local_misses
>                                           /local_hits
>                                           /remote_misses
>                                           /remote_hits
>                                           /...
>
>
> The software events and tracepoints and kprobes stuff we could hang off
> of /sys/kernel/ or something
>
> So your registration would indeed look something like:
>
> perf_event_register_pmu(struct pmu *pmu, int type),
>
> where type would normally be -1 (dynamic) but would be PERF_TYPE_ for
> those already laid down in ABI.
>
> This approach will also give us a good overview
> in /sys/class/event_source/, which will be a flat listing of all
> existing event sources.
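
To make the id handling above concrete, here is a rough userspace sketch
of how registration could hand out ids: a fixed PERF_TYPE_-style value is
taken as given, and -1 draws from a dynamic range. Everything in it
(event_source_register(), struct event_source, the SRC_TYPE_* names, the
16-entry table) is made up for illustration and is not existing kernel
API; the 64k base is only the "say at 64k" guess from above and is not
ABI.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed ids mirroring the current PERF_TYPE_* values (hypothetical names). */
enum {
	SRC_TYPE_HARDWARE	= 0,
	SRC_TYPE_SOFTWARE	= 1,
	SRC_TYPE_TRACEPOINT	= 2,
	SRC_TYPE_HW_CACHE	= 3,
	SRC_TYPE_RAW		= 4,
	/* Start of the dynamic range: an assumption, not ABI. */
	SRC_TYPE_DYNAMIC_BASE	= 65536,
};

#define MAX_SOURCES 16

/* Hypothetical stand-in for whatever the kernel would register. */
struct event_source {
	const char *name;
	uint32_t id;	/* filled in at registration time */
};

static struct event_source *registered[MAX_SOURCES];
static size_t nr_registered;
static uint32_t next_dynamic_id = SRC_TYPE_DYNAMIC_BASE;

/*
 * Register an event source. type == -1 asks for a dynamic id; any other
 * value must lie below the dynamic range and must not be taken already.
 * Returns the assigned id, or -1 on failure.
 */
long event_source_register(struct event_source *src, int type)
{
	uint32_t id;
	size_t i;

	if (nr_registered == MAX_SOURCES)
		return -1;
	if (type < -1 || type >= SRC_TYPE_DYNAMIC_BASE)
		return -1;

	id = (type == -1) ? next_dynamic_id : (uint32_t)type;

	/* Refuse an id that is already in use. */
	for (i = 0; i < nr_registered; i++)
		if (registered[i]->id == id)
			return -1;

	if (type == -1)
		next_dynamic_id++;

	src->id = id;
	registered[nr_registered++] = src;
	return (long)id;
}
```

The value returned here is what would show up in event_source_id, so
existing users of the fixed PERF_TYPE_ values keep working while new
PMUs get whatever the dynamic range hands out.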

So Russell reminded me that the ARM people have the problem that
their /proc/cpuinfo isn't specific enough to map to a unique event map.

Whilst extending ARM /proc/cpuinfo seems like a sensible option, it will
not cover anything but cpu events.

So in that vein (and to avoid exhaustive in-kernel event lists for no
good reason), it might make sense to also add some event_source
attributes that identify the thing, maybe an event_source_name or
event_source_driver field, so that userspace can uniquely map each
event source to its exhaustive event list.
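
Roughly what userspace could then do, as a sketch only: read the
identifying attribute out of the event source directory and use it as
the key into a table of event lists it ships itself. The attribute name
follows the event_source_name suggestion above; read_source_attr(),
find_event_list() and struct event_map are hypothetical helpers, and
none of these sysfs files exist today.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of <dir>/<attr> into buf, without the trailing
 * newline. Returns 0 on success, -1 on any error.
 */
int read_source_attr(const char *dir, const char *attr,
		     char *buf, size_t len)
{
	char path[512];
	FILE *f;
	int n;

	if (len == 0)
		return -1;

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Userspace-side table: which event list belongs to which source name. */
struct event_map {
	const char *source_name;
	const char *event_list;
};

/* Returns the event list for source_name, or NULL if it is unknown. */
const char *find_event_list(const struct event_map *maps, size_t nr,
			    const char *source_name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (strcmp(maps[i].source_name, source_name) == 0)
			return maps[i].event_list;

	return NULL;
}
```

That keeps the exhaustive lists out of the kernel; all the kernel has to
export is a string that is specific enough to pick the right one.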

