Re: NMI between switch_mm and switch_to

From: Ingo Molnar
Date: Mon Aug 03 2009 - 04:29:39 EST



* Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:

> On Tue, 2009-07-28 at 14:49 +1000, Paul Mackerras wrote:
>
> > Ben H. suggested there might be a problem if we get a PMU
> > interrupt and try to do a stack trace of userspace in the
> > interval between when we call switch_mm() from
> > sched.c:context_switch() and when we call switch_to(). If we
> > get an NMI in that interval and do a stack trace of userspace,
> > we'll see the registers of the old task but when we peek at user
> > addresses we'll see the memory image for the new task, so the
> > stack trace we get will be completely bogus.
> >
> > Is this in fact also a problem on x86, or is there some subtle
> > reason why it can't happen there?
>
> I can't spot one, maybe Ingo can when he's back :-)
>
> So I think this is very good spotting from Ben.

Yeah.

> We could use preempt notifiers (or put in our own hooks) to
> disable callchains during the context switch I suppose.

I think we should only disable user call-chains - the
in-kernel call-chain is still reliable.

Also, I think we don't need preempt notifiers; we can use a simple
check like this:

	if (current->mm &&
	    cpu_isset(smp_processor_id(), current->mm->cpu_vm_mask)) {

		...
	}

That check would go in the user call-chain code. We'd only touch the
user memory image if that bit is set. cpu_vm_mask is maintained
atomically and before we switch the MM, so it should be race-free.
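
To make the ordering argument concrete, here is a rough user-space
model of the idea, not kernel code. All names in it (model_mm,
model_task, model_switch_mm, user_callchain_allowed) are made up for
illustration, a plain 64-bit word stands in for cpu_vm_mask, and it
assumes the outgoing mm's bit is cleared before the new address space
is installed:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mm_struct: bit N set means the mm is live on CPU N. */
struct model_mm {
	uint64_t cpu_vm_mask;
};

/* Hypothetical stand-in for task_struct: mm is NULL for kernel threads. */
struct model_task {
	struct model_mm *mm;
};

/*
 * Model of the mask update around switch_mm(): the old mm loses this
 * CPU's bit and the new mm gains it (assumed ordering, see lead-in).
 */
static void model_switch_mm(struct model_mm *prev, struct model_mm *next,
			    unsigned int cpu)
{
	prev->cpu_vm_mask &= ~(UINT64_C(1) << cpu);
	next->cpu_vm_mask |= UINT64_C(1) << cpu;
}

/*
 * The proposed check: only walk the user stack if the interrupted task
 * has an mm and that mm is still the one marked live on this CPU.
 */
static bool user_callchain_allowed(const struct model_task *task,
				   unsigned int cpu)
{
	if (!task->mm || cpu >= 64)
		return false;
	return (task->mm->cpu_vm_mask >> cpu) & 1;
}
```

In this model, an NMI that arrives after model_switch_mm() but before
the task switch still sees the old task, finds its mm's bit cleared,
and skips the user call-chain; the kernel call-chain is unaffected.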

Ingo