[RFC PATCH 0/3 v2] perf: Add Intel Nehalem uncore pmu support

From: Lin Ming
Date: Sun Nov 21 2010 - 06:59:37 EST


Hi, all

Sorry for the late update; I was fixing an uncore NMI problem (see
below).
This v2 does not fully work yet, but it addresses most of the comments
on v1.

FYI, the links below are the v1 postings.
[DRAFT PATCH 0/3] perf: Add Intel Nehalem uncore pmu support
http://marc.info/?l=linux-kernel&m=128868293025309&w=2
http://marc.info/?l=linux-kernel&m=128868293025298&w=2
http://marc.info/?l=linux-kernel&m=128868296425366&w=2
http://marc.info/?l=linux-kernel&m=128868298125380&w=2

Applies on top of current tip/master (59c5300).

The main change is the uncore NMI handling code.
In v1, I assumed that all 4 cores would receive an NMI when any uncore
counter overflowed, so each core only handled the counters it had
enabled itself.
But actually only 1 of the 4 cores receives the NMI each time, and we
can't determine which one. So the NMI handler (running on 1 of the 4
cores) has to handle the counters enabled by all 4 cores.
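
To illustrate the idea, here is a minimal userspace model, not the
patch code. All names (model_event, model_uncore,
model_uncore_handle_nmi, MODEL_NUM_COUNTERS) are made up for this
sketch, and the counter count is only an illustrative value. The point
is that the handler walks one table shared by the whole socket instead
of a per-core list:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_COUNTERS 8	/* illustrative value for this model */

/* Hypothetical stand-in for an event bound to one uncore counter. */
struct model_event {
	int owner_cpu;			/* core that enabled the event */
	unsigned long overflows;	/* overflows handled so far */
};

/* One table per socket, shared by all cores of that socket. */
struct model_uncore {
	struct model_event *events[MODEL_NUM_COUNTERS];
};

/*
 * Runs on whichever core happened to take the NMI. 'status' models the
 * overflow status bits, one bit per counter. Every overflowed counter
 * with an event is handled, no matter which core enabled it.
 * Returns the mask of bits that were handled (the bits to ack).
 */
static uint64_t model_uncore_handle_nmi(struct model_uncore *uncore,
					uint64_t status)
{
	uint64_t handled = 0;
	int bit;

	assert(uncore != NULL);

	for (bit = 0; bit < MODEL_NUM_COUNTERS; bit++) {
		struct model_event *event = uncore->events[bit];

		if (!(status & (1ULL << bit)))
			continue;	/* this counter did not overflow */
		if (event == NULL)
			continue;	/* no event on this counter */

		event->overflows++;
		handled |= 1ULL << bit;
	}

	return handled;
}
```

Note that owner_cpu is never compared against the current cpu; that
comparison is what the v1 approach effectively did.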

Changelog for v2:

- modify the NMI handling code to handle all counters enabled by all 4
cores.

- allocate uncore_events[] table dynamically using kmalloc_node() to
avoid unnecessary remote memory accesses. (Stephane Eranian)

- add support for the fixed uncore counter. (Stephane Eranian)

- Handling of exclude_* bits. The uncore PMU measures at all privilege
levels all the time, so it doesn't make sense to specify any exclude
bits. (Stephane Eranian)

- make the uncore code more self-contained, i.e. do not include it in
arch/x86/kernel/cpu/perf_event.c (Peter Zijlstra)

- the uncore pmu interrupt has its own NMI_DIE notifier entry. (Peter
Zijlstra)
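
For the exclude_* item above, one plausible way to handle it is to
reject such events at init time. The sketch below is a userspace model
under that assumption, not the patch code: model_attr and
model_uncore_check_attr are hypothetical names, and only the four
exclude_* field names mirror struct perf_event_attr.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical subset of the attribute bits relevant here. */
struct model_attr {
	unsigned int exclude_user   : 1;
	unsigned int exclude_kernel : 1;
	unsigned int exclude_hv     : 1;
	unsigned int exclude_idle   : 1;
};

/*
 * The uncore counts at all privilege levels all the time, so a request
 * to exclude any of them cannot be honored. Returns 0 if the attribute
 * is acceptable, -EINVAL otherwise.
 */
static int model_uncore_check_attr(const struct model_attr *attr)
{
	assert(attr != NULL);

	if (attr->exclude_user || attr->exclude_kernel ||
	    attr->exclude_hv || attr->exclude_idle)
		return -EINVAL;

	return 0;
}
```

Silently ignoring the bits would be the other option; failing early at
least tells the user that the filter was not applied.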

Known bugs:

- When Hyper-Threading is enabled, both HT siblings receive the NMI. In
this case, the overflow status bits are not acked correctly.

TODO:

- per-task uncore events should not be allowed. Peter suggested simply
setting pmu::task_ctx_nr = perf_invalid_context.

- This whole implementation is per-node; it should be converted to
per-socket. Andi Kleen has commented that using numa_node_id() implies
this implementation can't be used when NUMA is turned off. It is better
to use the package id.

- add support for the uncore address/opcode match feature
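
As a rough picture of the first TODO item, the model below shows how an
invalid task context number turns away per-task events while per-cpu
events still go through. It is a userspace sketch, not kernel code:
model_pmu, model_find_context and the MODEL_* constants are
hypothetical, and the pid convention (-1 means "not attached to a
task") follows perf_event_open().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mirror of the task context numbers. */
enum model_ctx_nr {
	MODEL_INVALID_CONTEXT = -1,
	MODEL_HW_CONTEXT = 0,
	MODEL_SW_CONTEXT,
};

struct model_pmu {
	int task_ctx_nr;
};

/*
 * Context lookup for a new event. A pid of -1 means a per-cpu event,
 * which needs no task context. Any other pid asks for a per-task
 * event, which is refused when the pmu has no valid task context.
 */
static int model_find_context(const struct model_pmu *pmu, int pid)
{
	assert(pmu != NULL);

	if (pid == -1)
		return 0;		/* per-cpu event: always fine */

	if (pmu->task_ctx_nr < 0)
		return -EINVAL;		/* per-task event not allowed */

	return 0;
}
```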

As usual, any comments are much appreciated.

Lin Ming
