Re: [RFC] perf_events: support for uncore a.k.a. nest units

From: Corey Ashford
Date: Thu Jan 21 2010 - 14:13:48 EST




On 1/20/2010 11:21 PM, Ingo Molnar wrote:

> * Corey Ashford <cjashfor@xxxxxxxxxxxxxxxxxx> wrote:
>
>> I really think we need some sort of data structure which is passed from the
>> kernel to user space to represent the topology of the system, and give
>> useful information to be able to identify each PMU node. Whether this is
>> done with a sysfs-style tree, a table in a file, XML, etc... it doesn't
>> really matter much, but it needs to be something that can be parsed
>> relatively easily and *contains just enough information* for the user to be
>> able to correctly choose PMUs, and for the kernel to be able to relate that
>> back to actual PMU hardware.
>
> The right way would be to extend the current event description under
> /debug/tracing/events with hardware descriptors and (maybe) to formalise this
> into a separate /proc/events/ or into a separate filesystem.
>
> The advantage of this is that in the grand scheme of things we _really_ don't
> want to limit performance events to 'hardware' hierarchies, or to
> devices/sysfs, some existing /proc scheme, or any other arbitrary (and
> fundamentally limiting) object enumeration.
>
> We want a unified, logical enumeration of all events and objects that we care
> about from a performance monitoring and analysis point of view, shaped for the
> purpose of and parsed by perf user-space. And since the current event
> descriptors are already rather rich as they enumerate all sorts of things:
>
> - tracepoints
> - hw-breakpoints
> - dynamic probes
>
> etc., and are well used by tooling, we should expand those with real hardware
> structure.

This is intriguing; I like the idea of generalizing all of this info into one structure.

So you think that this structure should contain event info as well? If these structures are created by the kernel, I think that would necessitate placing large event tables into the kernel, which is something I think we'd prefer to avoid because of the amount of memory it would take. Keep in mind that we need not only event names, but also event descriptions, encodings, attributes (e.g. unit masks), attribute descriptions, etc. I suppose the kernel could read a file from the file system and then add this info to the tree, but that seems like the wrong direction for the data to flow. Are there existing places where the kernel reads a file from user space in order to populate a pseudo filesystem?
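
To make the memory concern a bit more concrete, here is a rough sketch of what a single event table entry would have to carry. The class names, field names and the example entry are all made up for illustration; this is not an existing kernel or library layout, and the helper only counts the characters of descriptive text, which is a crude stand-in for real memory use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventAttr:
    """One attribute of an event, e.g. a unit mask (hypothetical layout)."""
    name: str
    description: str
    value: int


@dataclass
class EventDesc:
    """One entry of a per-PMU event table (hypothetical layout)."""
    name: str
    description: str
    encoding: int
    attrs: List[EventAttr] = field(default_factory=list)


def text_chars(table: List[EventDesc]) -> int:
    """Count only the characters of human-readable strings in the table."""
    total = 0
    for ev in table:
        total += len(ev.name) + len(ev.description)
        for attr in ev.attrs:
            total += len(attr.name) + len(attr.description)
    return total


# Made-up entry; the names and numbers are placeholders, not real events.
example_table = [
    EventDesc(
        name="EXAMPLE_EVENT",
        description="Placeholder description of what the event counts",
        encoding=0x01,
        attrs=[EventAttr("EXAMPLE_UMASK", "Placeholder unit mask", 0x02)],
    )
]
```

Calling text_chars(example_table) gives the amount of text that one entry alone would pin down if the table lived in the kernel; multiply that by every event and attribute on every supported PMU to see why I'd rather keep it out.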

I think keeping event naming in user space and PMU naming in kernel space might be a better idea: the kernel exposes the available PMUs to user space via some structure, and a user space library tries to recognize the exposed PMUs and provides event lists and other needed info. The perf tool would use this library to list the available events to users.
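
A minimal sketch of that split, assuming the kernel hands user space nothing more than a list of PMU identifier strings. The identifiers, the event names and the list_events() helper are all hypothetical; how the kernel would actually export the list is exactly the open question in this thread.

```python
from typing import Dict, List

# Hypothetical user-space event tables, keyed by a PMU identifier string.
# The identifiers and event names are placeholders, not real hardware names.
KNOWN_PMUS: Dict[str, List[str]] = {
    "example_core_pmu": ["EXAMPLE_CYCLES", "EXAMPLE_INSTRUCTIONS"],
    "example_nest_pmu": ["EXAMPLE_MEM_READS"],
}


def list_events(exposed_pmus: List[str]) -> Dict[str, List[str]]:
    """Return the event lists for every exposed PMU the library recognizes.

    PMUs the library does not know about are skipped rather than guessed at.
    """
    return {pmu: KNOWN_PMUS[pmu] for pmu in exposed_pmus if pmu in KNOWN_PMUS}


# Stand-in for whatever the kernel would export; the format is an assumption.
exposed = ["example_nest_pmu", "unknown_pmu"]
events = list_events(exposed)
```

The point of the design is that the kernel only has to name the PMUs it has; an unrecognized PMU simply yields no events until the library is updated, with no kernel change needed.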

--
Regards,

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
cjashfor@xxxxxxxxxx
