Re: Q: perf_install_in_context/perf_event_enable are racy?

From: Peter Zijlstra
Date: Mon Jan 31 2011 - 13:22:35 EST


On Mon, 2011-01-31 at 18:26 +0100, Oleg Nesterov wrote:
> On 01/28, Peter Zijlstra wrote:
> >
> > Just to give you more food for thought, I couldn't help myself..
>
> Hmm. So far I am only trying to understand the perf_install_in_context()
> paths. And, after I spent almost 2 hours, I am starting to believe this
> change is probably good ;)

phew ;-)

> I do not understand the point of cpu_function_call() though, it looks
> equal to smp_call_function_single() ?

Very nearly so, except it takes a function that returns an int..
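
To illustrate the difference, here is a minimal userspace sketch, not the
patch itself. A cross-CPU call that only accepts a void callback can still
hand back an int if the real function and its result travel in a small
carrier struct. All names below (remote_call, remote_trampoline, run_on_cpu,
cpu_function_call_sketch) are hypothetical, and run_on_cpu() simply calls the
function locally instead of sending an IPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the void-returning callback type that a
 * cross-CPU call primitive accepts. */
typedef void (*void_fn)(void *);

/* Carrier that lets an int-returning function ride a void callback. */
struct remote_call {
	int (*func)(void *info);
	void *info;
	int ret;
};

/* The void callback: run the real function and stash its result. */
static void remote_trampoline(void *data)
{
	struct remote_call *rc = data;

	rc->ret = rc->func(rc->info);
}

/* Assumption: stands in for the cross-CPU call; here it runs locally. */
static void run_on_cpu(int cpu, void_fn fn, void *data)
{
	(void)cpu;
	fn(data);
}

/* Same shape as a void cross-call, but the callee's int comes back. */
static int cpu_function_call_sketch(int cpu, int (*func)(void *), void *info)
{
	struct remote_call rc = { .func = func, .info = info, .ret = -1 };

	assert(func != NULL);
	run_on_cpu(cpu, remote_trampoline, &rc);
	return rc.ret;
}

/* Example callee: returns twice the int that info points at. */
static int double_it(void *info)
{
	return 2 * *(int *)info;
}
```

A caller would pass something like double_it and a pointer to its argument,
and get the callee's return value back directly instead of digging it out of
a shared struct by hand.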

> > -static void __perf_install_in_context(void *info)
> > +static int __perf_install_in_context(void *info)
> >  {
> >  	struct perf_event *event = info;
> >  	struct perf_event_context *ctx = event->ctx;
> > @@ -942,20 +1015,15 @@ static void __perf_install_in_context(void *info)
> >  	int err;
> >
> >  	/*
> > -	 * If this is a task context, we need to check whether it is
> > -	 * the current task context of this cpu. If not it has been
> > -	 * scheduled out before the smp call arrived.
> > -	 * Or possibly this is the right context but it isn't
> > -	 * on this cpu because it had no events.
> > +	 * In case we're installing a new context to an already running task,
> > +	 * could also happen before perf_event_task_sched_in() on architectures
> > +	 * which do context switches with IRQs enabled.
> >  	 */
> > -	if (ctx->task && cpuctx->task_ctx != ctx) {
> > -		if (cpuctx->task_ctx || ctx->task != current)
> > -			return;
> > -		cpuctx->task_ctx = ctx;
> > -	}
> > +	if (ctx->task && !cpuctx->task_ctx)
> > +		perf_event_context_sched_in(ctx);
>
> OK... This eliminates the 2nd race with __ARCH_WANT_INTERRUPTS_ON_CTXSW
> (we must not set "cpuctx->task_ctx = ctx" in case "next" is going to
> do perf_event_context_sched_in() later). So it is enough to check
> rq->curr in remote_function().

Right, but since I moved those functions into perf_event.c (they were
getting rather specific) I can no longer deref (or even obtain) a rq
structure. So it implements rq->curr == p in a somewhat round-about
fashion but it should be identical.
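
As a rough illustration only, a check of that kind could be expressed
without touching the runqueue along the lines below. This is a userspace
sketch under stated assumptions, not the code from the patch: struct
sketch_task, its cpu and on_cpu fields, and the -EAGAIN convention are all
hypothetical stand-ins for "p is the task currently running on this cpu".

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified task: the CPU it was last placed on and
 * whether it is executing there right now. */
struct sketch_task {
	int cpu;
	int on_cpu;
};

struct sketch_task_call {
	struct sketch_task *p;	/* NULL: no task check wanted */
	int (*func)(void *info);
	void *info;
	int ret;
};

/*
 * Runs on 'this_cpu'. If a task was given, only call func when that
 * task is the one running here; otherwise report -EAGAIN so the
 * caller can decide to retry.
 */
static void sketch_remote_function(int this_cpu, struct sketch_task_call *tc)
{
	if (tc->p) {
		tc->ret = -EAGAIN;
		if (tc->p->cpu != this_cpu || !tc->p->on_cpu)
			return;
	}
	tc->ret = tc->func(tc->info);
}

/* Example callee that always succeeds. */
static int sketch_install(void *info)
{
	(void)info;
	return 0;
}
```

The point of returning -EAGAIN rather than silently doing nothing is that
the caller can tell "the task was not running there" apart from "the
function ran and returned 0".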

>
> >  	raw_spin_lock(&ctx->lock);
> > -	ctx->is_active = 1;
> > +	WARN_ON_ONCE(!ctx->is_active);
>
> This looks wrong if ctx->task == NULL.

cpuctx->ctx should still have ->is_active = 1 I think.

>
> So. With this patch it is possible that perf_event_context_sched_in()
> is called right after prepare_lock_switch(). Stupid question, why
> can't we always do this then? I mean, what if we change
> prepare_task_switch() to do
>
> perf_event_task_sched_out();
> perf_event_task_sched_in();
>
> ?
>
> Probably we can unify the COND_STMT(perf_task_events) check and simplify
> the things further.

That might work. Ingo, any reason we have a pre and post hook around the
context switch and not a single function?
