[PATCH 12/12] perf: Robustify task_function_call()

From: Peter Zijlstra
Date: Wed Feb 24 2016 - 12:54:47 EST


Since there is no serialization between task_function_call() doing
task_curr() and the other CPU doing context switches, we could end
up not sending an IPI even if we had to.

And I'm not sure I still buy my own argument we're OK.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
kernel/events/core.c | 40 ++++++++++++++++++++--------------------
1 file changed, 20 insertions(+), 20 deletions(-)

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -64,8 +64,17 @@ static void remote_function(void *data)
struct task_struct *p = tfc->p;

if (p) {
- tfc->ret = -EAGAIN;
- if (task_cpu(p) != smp_processor_id() || !task_curr(p))
+ /* -EAGAIN */
+ if (task_cpu(p) != smp_processor_id())
+ return;
+
+ /*
+ * Now that we're on right CPU with IRQs disabled, we can test
+ * if we hit the right task without races.
+ */
+
+ tfc->ret = -ESRCH; /* No such (running) process */
+ if (p != current)
return;
}

@@ -92,13 +101,17 @@ task_function_call(struct task_struct *p
.p = p,
.func = func,
.info = info,
- .ret = -ESRCH, /* No such (running) process */
+ .ret = -EAGAIN,
};
+ int ret;

- if (task_curr(p))
- smp_call_function_single(task_cpu(p), remote_function, &data, 1);
+ do {
+ ret = smp_call_function_single(task_cpu(p), remote_function, &data, 1);
+ if (!ret)
+ ret = data.ret;
+ } while (ret == -EAGAIN);

- return data.ret;
+ return ret;
}

/**
@@ -169,19 +182,6 @@ static bool is_kernel_event(struct perf_
* rely on ctx->is_active and therefore cannot use event_function_call().
* See perf_install_in_context().
*
- * This is because we need a ctx->lock serialized variable (ctx->is_active)
- * to reliably determine if a particular task/context is scheduled in. The
- * task_curr() use in task_function_call() is racy in that a remote context
- * switch is not a single atomic operation.
- *
- * As is, the situation is 'safe' because we set rq->curr before we do the
- * actual context switch. This means that task_curr() will fail early, but
- * we'll continue spinning on ctx->is_active until we've passed
- * perf_event_task_sched_out().
- *
- * Without this ctx->lock serialized variable we could have race where we find
- * the task (and hence the context) would not be active while in fact they are.
- *
* If ctx->nr_events, then ctx->is_active and cpuctx->task_ctx are set.
*/

@@ -212,7 +212,7 @@ static int event_function(void *info)
*/
if (ctx->task) {
if (ctx->task != current) {
- ret = -EAGAIN;
+ ret = -ESRCH;
goto unlock;
}