[PATCH V2 1/6] perf,core: allow invalid context events to be part of sw/hw groups

From: Kan Liang
Date: Wed Apr 15 2015 - 11:10:06 EST


From: Kan Liang <kan.liang@xxxxxxxxx>

A pmu marked as perf_invalid_context doesn't have any state to switch on
a context switch. Everything is global, so it is OK for its events to be
part of sw/hw groups.
In sched_out/sched_in, del/add must be called so that the
perf_invalid_context event is disabled/enabled accordingly during a
context switch. The event count can only be read while the event is
sched_in.

However, group read doesn't work with mixed events.

For example, the following command always fails with EINVAL:
perf record -e '{cycles,uncore_imc_0/cas_count_read/}:S' -a sleep 1

This patch set fixes the issue. With it applied, the same command succeeds:
perf record -e '{cycles,uncore_imc_0/cas_count_read/}:S' -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.202 MB perf.data (12 samples) ]

This patch special-cases invalid context events and allows them to be part
of sw/hw groups.

Signed-off-by: Kan Liang <kan.liang@xxxxxxxxx>
---

Changes since V1
- For a pure invalid context event group, use the leader's pmu as the
event members' pmu.
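
For review purposes, the grouping rules in the perf_event_open() hunk can
be summarized as a small user-space model. This is only a sketch, not
kernel code: the names ctx_kind, group_model, decision and decide() are
hypothetical, the enum values are arbitrary stand-ins for task_ctx_nr, and
returning -EINVAL when no hardware sibling is found is an assumption of the
sketch (the patch just jumps to err_alloc on that path).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pmu->task_ctx_nr kinds. */
enum ctx_kind { CTX_INVALID, CTX_HW, CTX_SW };

/* Hypothetical summary of an existing group at perf_event_open() time. */
struct group_model {
	enum ctx_kind leader;	/* context kind of the group leader's pmu */
	bool pure_software;	/* models PERF_GROUP_SOFTWARE being set */
	bool has_hw_sibling;	/* a hardware event is already a sibling */
};

struct decision {
	int err;		/* 0 or -EINVAL */
	enum ctx_kind ctx;	/* context kind of the pmu picked for the event */
	bool move_group;	/* whole group moves to the hardware context */
};

static struct decision decide(const struct group_model *g, enum ctx_kind event)
{
	struct decision d = { .err = 0, .ctx = event, .move_group = false };

	if (g == NULL)		/* no group leader: keep the event's own pmu */
		return d;

	if (g->leader == event) {
		/* Same kind, including a pure invalid context group. */
		d.ctx = g->leader;
		return d;
	}

	switch (g->leader) {
	case CTX_INVALID:
		/* Invalid context events only lead pure invalid context groups. */
		d.err = -EINVAL;
		break;
	case CTX_SW:
		if (event == CTX_INVALID) {
			if (g->pure_software)
				d.ctx = CTX_SW;		/* join the software group */
			else if (g->has_hw_sibling)
				d.ctx = CTX_HW;		/* borrow the hw sibling's pmu */
			else
				d.err = -EINVAL;	/* assumption, see above */
		} else if (g->pure_software) {
			/* Hardware event joining a pure software group. */
			d.move_group = true;
		}
		break;
	case CTX_HW:
		/* Software and invalid context events join hardware groups. */
		d.ctx = CTX_HW;
		break;
	}
	return d;
}

/* Usage: the '{cycles,uncore_imc_0/cas_count_read/}' case from above. */
void example_hw_leader_uncore_member(void)
{
	struct group_model g = { .leader = CTX_HW };
	struct decision d = decide(&g, CTX_INVALID);

	assert(d.err == 0 && d.ctx == CTX_HW && !d.move_group);
}
```

The model only tracks which kind of context the chosen pmu belongs to; it
deliberately ignores scheduling, locking and the actual context lookup.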

include/linux/perf_event.h | 8 +++++
kernel/events/core.c | 76 ++++++++++++++++++++++++++++++++++------------
2 files changed, 65 insertions(+), 19 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 61992cf..ecc80fa 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -742,6 +742,14 @@ static inline bool is_sampling_event(struct perf_event *event)
/*
* Return 1 for a software event, 0 for a hardware event
*/
+static inline int is_invalid_context_event(struct perf_event *event)
+{
+ return event->pmu->task_ctx_nr == perf_invalid_context;
+}
+
+/*
+ * Return 1 for a software event, 0 for a hardware event
+ */
static inline int is_software_event(struct perf_event *event)
{
return event->pmu->task_ctx_nr == perf_sw_context;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 06917d5..519ae0c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1351,7 +1351,7 @@ static void perf_group_attach(struct perf_event *event)
WARN_ON_ONCE(group_leader->ctx != event->ctx);

if (group_leader->group_flags & PERF_GROUP_SOFTWARE &&
- !is_software_event(event))
+ !is_software_event(event) && !is_invalid_context_event(event))
group_leader->group_flags &= ~PERF_GROUP_SOFTWARE;

list_add_tail(&event->group_entry, &group_leader->sibling_list);
@@ -7946,8 +7946,11 @@ SYSCALL_DEFINE5(perf_event_open,
account_event(event);

/*
- * Special case software events and allow them to be part of
- * any hardware group.
+ * Special case for software events and invalid context events.
+ * Allow software events to be part of any hardware group.
+ * Invalid context events can only be the group leader for pure
+ * invalid context event group, but could be part of any
+ * software/hardware group.
*/
pmu = event->pmu;

@@ -7958,27 +7961,62 @@ SYSCALL_DEFINE5(perf_event_open,
}

if (group_leader &&
- (is_software_event(event) != is_software_event(group_leader))) {
- if (is_software_event(event)) {
+ (group_leader->pmu->task_ctx_nr != event->pmu->task_ctx_nr)) {
+ if (is_invalid_context_event(group_leader)) {
+ err = -EINVAL;
+ goto err_alloc;
+ } else if (is_software_event(group_leader)) {
+ if (is_invalid_context_event(event)) {
+ if (group_leader->group_flags & PERF_GROUP_SOFTWARE) {
+ /*
+ * If group_leader is software event
+ * and event is invalid context event
+ * allow the addition of invalid
+ * context event to software groups.
+ */
+ pmu = group_leader->pmu;
+ } else {
+ /*
+ * Group leader is a software event,
+ * but the group is not a pure software group.
+ * There must be a hardware event in the group;
+ * find it and use its pmu for this event.
+ */
+ struct perf_event *tmp;
+
+ list_for_each_entry(tmp, &group_leader->sibling_list, group_entry) {
+ if (tmp->pmu->task_ctx_nr == perf_hw_context) {
+ pmu = tmp->pmu;
+ break;
+ }
+ }
+ if (pmu == event->pmu)
+ goto err_alloc;
+ }
+ } else {
+ if (group_leader->group_flags & PERF_GROUP_SOFTWARE) {
+ /*
+ * In case the group is pure software group,
+ * and we try to add a hardware event,
+ * move the whole group to hardware context.
+ */
+ move_group = 1;
+ }
+ }
+ } else {
/*
- * If event and group_leader are not both a software
- * event, and event is, then group leader is not.
- *
- * Allow the addition of software events to !software
- * groups, this is safe because software events never
- * fail to schedule.
+ * If group_leader is hardware event and event is not,
+ * allow the addition of !hardware events to hardware
+ * groups. This is safe because software events and
+ * invalid context events never fail to schedule.
*/
pmu = group_leader->pmu;
- } else if (is_software_event(group_leader) &&
- (group_leader->group_flags & PERF_GROUP_SOFTWARE)) {
- /*
- * In case the group is a pure software group, and we
- * try to add a hardware event, move the whole group to
- * the hardware context.
- */
- move_group = 1;
}
}
+ if (group_leader &&
+ is_invalid_context_event(group_leader) &&
+ is_invalid_context_event(event))
+ pmu = group_leader->pmu;

/*
* Get the target context (task or percpu):
--
1.8.3.1
