[tip:perf/urgent] perf/core: Do not set cpuctx->cgrp for unscheduled cgroups

From: tip-bot for David Carrillo-Cisneros
Date: Wed Nov 16 2016 - 03:37:34 EST


Commit-ID: 864c2357ca898c6171fe5284f5ecc795c8ce27a8
Gitweb: http://git.kernel.org/tip/864c2357ca898c6171fe5284f5ecc795c8ce27a8
Author: David Carrillo-Cisneros <davidcc@xxxxxxxxxx>
AuthorDate: Tue, 1 Nov 2016 11:52:58 -0700
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Tue, 15 Nov 2016 14:18:22 +0100

perf/core: Do not set cpuctx->cgrp for unscheduled cgroups

Commit:

db4a835601b7 ("perf/core: Set cgroup in CPU contexts for new cgroup events")

failed to verify that event->cgrp is actually the cgroup scheduled
on a CPU before setting cpuctx->cgrp. This patch fixes that.

Now that there are different paths for scheduled and unscheduled
cgroups, add a warning to catch when cpuctx->cgrp is still set after
the last cgroup event has been unscheduled.

To verify the bug:

# Create 2 cgroups.
mkdir /dev/cgroups/devices/g1
mkdir /dev/cgroups/devices/g2

# Launch a task, bind it to a CPU and move it to g1.
CPU=2
while :; do : ; done &
P=$!

taskset -pc $CPU $P
echo $P > /dev/cgroups/devices/g1/tasks

# Monitor g2 (it runs no tasks) and observe the output.
perf stat -e cycles -I 1000 -C $CPU -G g2

#        time             counts unit events
  1.000091408          7,579,527      cycles   g2
  2.000350111      <not counted>      cycles   g2
  3.000589181      <not counted>      cycles   g2
  4.000771428      <not counted>      cycles   g2

# Note that the first line shows a task running in g2, despite
# g2 having no tasks. This is because cpuctx->cgrp was wrongly
# set when the context of the new event was installed.
# After applying the fix we obtain the right output:

perf stat -e cycles -I 1000 -C $CPU -G g2
#        time             counts unit events
  1.000119615      <not counted>      cycles   g2
  2.000389430      <not counted>      cycles   g2
  3.000590962      <not counted>      cycles   g2

Signed-off-by: David Carrillo-Cisneros <davidcc@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Kan Liang <kan.liang@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Nilay Vaish <nilayvaish@xxxxxxxxx>
Cc: Paul Turner <pjt@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Stephane Eranian <eranian@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Vegard Nossum <vegard.nossum@xxxxxxxxx>
Link: http://lkml.kernel.org/r/1478026378-86083-1-git-send-email-davidcc@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/events/core.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 0e29213..ff230bb 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -902,6 +902,17 @@ list_update_cgroup_event(struct perf_event *event,
 	 * this will always be called from the right CPU.
 	 */
 	cpuctx = __get_cpu_context(ctx);
+
+	/* Only set/clear cpuctx->cgrp if current task uses event->cgrp. */
+	if (perf_cgroup_from_task(current, ctx) != event->cgrp) {
+		/*
+		 * We are removing the last cpu event in this context.
+		 * If that event is not active in this cpu, cpuctx->cgrp
+		 * should've been cleared by perf_cgroup_switch.
+		 */
+		WARN_ON_ONCE(!add && cpuctx->cgrp);
+		return;
+	}
 	cpuctx->cgrp = add ? event->cgrp : NULL;
 }
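
For illustration only, the guard added by the hunk above can be modeled in
user space roughly as follows. This is a sketch, not kernel code: the names
cgroup_model, cpuctx_model, update_cpuctx_cgrp and check_scenarios are
hypothetical stand-ins, the current task's cgroup is passed in as a plain
argument instead of being looked up with perf_cgroup_from_task(), and the
boolean return value stands in for the condition given to WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct cgroup_model { const char *name; };
struct cpuctx_model { const struct cgroup_model *cgrp; };

/*
 * Model of the guarded update: cpuctx->cgrp is only set or cleared when
 * the cgroup of the task currently on the CPU matches the event's cgroup.
 * Returns true when the modeled warning condition (!add && cpuctx->cgrp)
 * holds on the mismatch path.
 */
static bool update_cpuctx_cgrp(struct cpuctx_model *cpuctx,
			       const struct cgroup_model *current_cgrp,
			       const struct cgroup_model *event_cgrp,
			       bool add)
{
	if (current_cgrp != event_cgrp) {
		/* Last event removed while its cgroup is not scheduled. */
		return !add && cpuctx->cgrp != NULL;
	}
	cpuctx->cgrp = add ? event_cgrp : NULL;
	return false;
}

/* Walks through the scenario from the changelog plus the warning case. */
static void check_scenarios(void)
{
	struct cgroup_model g1 = { "g1" }, g2 = { "g2" };
	struct cpuctx_model cpuctx = { NULL };

	/* Task runs in g1, event monitors g2: cgrp must stay unset. */
	assert(!update_cpuctx_cgrp(&cpuctx, &g1, &g2, true));
	assert(cpuctx.cgrp == NULL);

	/* Task runs in g2, event monitors g2: cgrp is set, then cleared. */
	assert(!update_cpuctx_cgrp(&cpuctx, &g2, &g2, true));
	assert(cpuctx.cgrp == &g2);
	assert(!update_cpuctx_cgrp(&cpuctx, &g2, &g2, false));
	assert(cpuctx.cgrp == NULL);

	/* Stale cgrp left behind while g1 runs: the warning would fire. */
	cpuctx.cgrp = &g2;
	assert(update_cpuctx_cgrp(&cpuctx, &g1, &g2, false));
}
```

Calling check_scenarios() from a small main() exercises the three cases; the
first one corresponds to the "perf stat ... -G g2" reproduction in the
changelog, where g2 runs no tasks and cpuctx->cgrp must remain unset.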