Re: [announce] Performance Counters for Linux, v6

From: Corey Ashford
Date: Fri Feb 20 2009 - 17:38:58 EST


Ingo Molnar wrote:
* Corey Ashford <cjashfor@xxxxxxxxxxxxxxxxxx> wrote:

Ingo Molnar wrote:
We are pleased to announce version 6 of our performance counters subsystem implementation. The shortlog, diffstat and the combo patch can be found below. The combo patch against latest -git (2.6.29-rc2) can also be found at:

[snip]

Hi Ingo,

As I was starting to put together a simple implementation of PAPI on top of PCL for Power, I noticed that PCL does not seem to have any sort of versioning, or any way of ascertaining which capabilities the running kernel currently provides.

This information is needed by tools and libraries built on top of PCL so that they can know what is supported and if any bugs need to be worked around.

I'd prefer to use the standard Linux syscall ABI convention here:

- once upstream, existing functionality is compatible forever

- new functionality is added in a way that it generates a -ENOSYS return from the syscall in an older kernel.

That's why the event structure is sized relatively large for example - to make sure we have space to grow into.
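As an illustration of how a tool could rely on that convention rather than on a version number, here is a minimal sketch of classifying the result of a trial call. The enum and the function name classify_probe are made up for this example and are not part of PCL; the sketch only assumes the usual libc convention that a failed syscall returns -1 and sets errno.

```c
#include <errno.h>

/* Hypothetical result codes for a feature probe (not part of any real API). */
enum probe_result { PROBE_SUPPORTED, PROBE_UNSUPPORTED, PROBE_ERROR };

/*
 * Classify the outcome of a trial call.
 * 'ret' is the call's return value and 'err' is errno sampled
 * immediately after it (libc convention: -1 plus errno on failure).
 */
static enum probe_result classify_probe(long ret, int err)
{
	if (ret >= 0)
		return PROBE_SUPPORTED;
	if (err == ENOSYS)
		return PROBE_UNSUPPORTED;	/* older kernel: feature absent */
	return PROBE_ERROR;			/* failed for some other reason */
}
```

A library would try the new functionality once at startup, pass the return value and errno to a helper like this, and fall back to the older code path on PROBE_UNSUPPORTED.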

So instead of adding versioning information, it would be very nice if you could check the ABI details for 'traps' that make extensions harder. Try to come up with pie-in-the-sky future items you'd like to see in the ABI, and let's see how supportable they would be.

Example #1 - made up. Say if we had an ABI detail like this:

struct perf_counter_hw_event {
	u8 type;

this would limit us to 256 events - which would be clearly stupid as we can easily hit that limit.

Example #2. Not made up:

asmlinkage int
sys_perf_counter_open(struct perf_counter_hw_event *hw_event_uptr __user,
		      pid_t pid, int cpu, int group_fd)

Those are 4 parameters - we could extend it to 5 and add a 'flags' value that in the current version will return -ENOSYS if the flags value is not zero.

This would add one more dimension of extensibility to the interface.
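A rough userspace model of that flags check, for illustration only. This is not the kernel implementation: the names perf_counter_open_sketch and struct perf_counter_hw_event_sketch are hypothetical, the structure contents are placeholders, and no file descriptor is actually created.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the real event structure; contents elided. */
struct perf_counter_hw_event_sketch {
	unsigned long long type;
};

/*
 * Userspace model of an entry point that takes an extra 'flags'
 * argument. Returns 0 where a real call would return a valid fd.
 */
static int
perf_counter_open_sketch(const struct perf_counter_hw_event_sketch *hw_event,
			 pid_t pid, int cpu, int group_fd,
			 unsigned long flags)
{
	/* No flag bits are defined yet, so any non-zero value is rejected. */
	if (flags != 0)
		return -ENOSYS;

	if (hw_event == NULL)
		return -EFAULT;

	/* These arguments are not used by this model. */
	(void)pid;
	(void)cpu;
	(void)group_fd;

	return 0;
}
```

A caller that passes a non-zero flags value to this model gets -ENOSYS back, which is how a tool running on an older kernel would learn that a later extension is not available.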

If you could come up with a list of small details like this, that would be really helpful. Would this work?

Ingo

Thanks for the reply. As I flesh out this PAPI code, I will keep thinking about these issues.

I think the method you describe is good for adding new event types and accessing fancy PMU hardware (instruction matching CAMs for example).

There may be other non-event-related changes that will not be handled quite as well in this way. In my original email I hinted that we may want an option for mmap'd sample buffers at some point, and I'm not clear how you'd provide an ABI to request them (you would probably need to be able to request the size and get back a pointer to the mmap'd buffer). Would this be done through a special sys_perf_counter_open call, or through a subsequent ioctl call on the group leader after an open (which requires the counters to be initially disabled)?

For bugs in the kernel that need to be worked around, I assume you would suggest to the tool programmers that they somehow test for the bug's presence? What if the bug causes a system crash? Perhaps a better solution for that case would be to check the kernel's version number rather than create a separate PCL version?
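To make the kernel-version idea concrete, here is a minimal sketch that reads the running kernel's release string with uname(2) and compares it against a minimum version. The helper names are made up for this example, and the version numbers a tool would compare against are entirely up to the tool.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Parse "major.minor[.patch]" from the start of a release string such
 * as "2.6.29-rc2". Returns 0 on success, -1 if fewer than two numbers
 * could be read. A missing patch level is treated as 0.
 */
static int parse_kernel_release(const char *release,
				int *major, int *minor, int *patch)
{
	*patch = 0;
	return sscanf(release, "%d.%d.%d", major, minor, patch) >= 2 ? 0 : -1;
}

/* Return 1 if major.minor.patch >= the wanted version, else 0. */
static int version_at_least(int major, int minor, int patch,
			    int want_major, int want_minor, int want_patch)
{
	if (major != want_major)
		return major > want_major;
	if (minor != want_minor)
		return minor > want_minor;
	return patch >= want_patch;
}

/*
 * Return 1 if the running kernel is at least the wanted version,
 * 0 if it is older, and -1 if the version could not be determined.
 */
static int running_kernel_at_least(int want_major, int want_minor,
				   int want_patch)
{
	struct utsname u;
	int major, minor, patch;

	if (uname(&u) != 0)
		return -1;
	if (parse_kernel_release(u.release, &major, &minor, &patch) != 0)
		return -1;
	return version_at_least(major, minor, patch,
				want_major, want_minor, want_patch);
}
```

One caveat with this approach: distribution kernels often carry backported fixes, so the release number alone may not say whether a particular bug is present.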

Thanks for your consideration,

- Corey

Corey Ashford
Software Engineer
IBM Linux Technology Center, Linux Toolchain
Beaverton, OR
503-578-3507
cjashfor@xxxxxxxxxx


