[tip:perf/urgent] perf: Fix task context scheduling

From: tip-bot for Peter Zijlstra
Date: Thu Mar 31 2011 - 08:42:21 EST


Commit-ID: ab711fe08297de1485fff0a366e6db8828cafd6a
Gitweb: http://git.kernel.org/tip/ab711fe08297de1485fff0a366e6db8828cafd6a
Author: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
AuthorDate: Thu, 31 Mar 2011 10:29:26 +0200
Committer: Ingo Molnar <mingo@xxxxxxx>
CommitDate: Thu, 31 Mar 2011 13:02:55 +0200

perf: Fix task context scheduling

Jiri reported:

|
| - once an event is created by sys_perf_event_open, task context
| is created and it stays even if the event is closed, until the
| task is finished ... thats what I see in code and I assume it's
| correct
|
| - when the task opens event, perf_sched_events jump label is
| incremented and following callbacks are started from scheduler
|
| __perf_event_task_sched_in
| __perf_event_task_sched_out
|
| These callback *in/out set/unset cpuctx->task_ctx value to the
| task context.
|
| - close is called on event on CPU 0:
| - the task is scheduled on CPU 0
| - __perf_event_task_sched_in is called
| - cpuctx->task_ctx is set
| - perf_sched_events jump label is decremented and == 0
| - __perf_event_task_sched_out is not called
| - cpuctx->task_ctx on CPU 0 stays set
|
| - exit is called on CPU 1:
| - the task is scheduled on CPU 1
| - perf_event_exit_task is called
| - task_ctx_sched_out unsets cpuctx->task_ctx on CPU 1
| - put_ctx destroys the context
|
| - another call of perf_rotate_context on CPU 0 will use invalid
| task_ctx pointer, and eventualy panic.
|

Cure this the simplest possibly way by partially reverting the
jump_label optimization for the sched_out case.

Reported-and-tested-by: Jiri Olsa <jolsa@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: <stable@xxxxxxxxxx> # .37+
LKML-Reference: <1301520405.4859.213.camel@twins>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
include/linux/perf_event.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 311b4dc..04d75a8 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1086,7 +1086,7 @@ void perf_event_task_sched_out(struct task_struct *task, struct task_struct *nex
{
perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 1, NULL, 0);

- COND_STMT(&perf_sched_events, __perf_event_task_sched_out(task, next));
+ __perf_event_task_sched_out(task, next);
}

extern void perf_event_mmap(struct vm_area_struct *vma);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/