Re: Perhaps a side effect regarding NMI returns

From: Steven Rostedt
Date: Tue Nov 29 2011 - 15:58:23 EST


On Tue, 2011-11-29 at 12:36 -0800, Linus Torvalds wrote:
> On Tue, Nov 29, 2011 at 12:31 PM, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> >
> > As a simple fix your proposal of forcing IRET sounds good.
>
> We could of course use iret to return to the regular kernel stack, and
> do the schedule from there.
>
> So instead of doing the manual stack switch, just build a fake iret
> stack on our exception stack. Subtle and somewhat complicated. I'd
> almost rather just do a blind iret, and leave the 'iret to regular
> stack' as a possible future option.

Note, the reason I've been looking at this code is that I'm working on
implementing your idea to handle irets in NMIs, whether they are caused
by faults, exceptions, or the case I really care about: debugging.

Your proposal is here:

https://lkml.org/lkml/2010/7/14/264

But to make this work, it would be really nice if the NMI routine wasn't
entangled with the paranoid_exit code.

For things like static_branch()/jump_label and modifying ftrace nops to
calls and back, we currently use the big hammer approach: stop_machine().
This keeps other CPUs from executing code while it is being modified.
There are also tricks to handle NMIs that may be running on the stopped
CPUs.

But people don't like the overhead that stop_machine() causes, and I
have code that can make the modifications for ftrace with break points.
Adding a break point, syncing, then modifying the code, and finally
replacing the break point with the new op will greatly reduce the
overhead. At least the latency will be much less.

The problem is that ftrace affects code in NMIs. We tried to not trace
NMIs, but there are so many functions that NMIs call, it ended up being
a losing battle. But if we can fix the problem of NMIs being re-enabled
on iret, we can then use the break point scheme for both static_branch()
and ftrace, and remove the overhead of stop_machine(). I think there's a
possibility to use kprobes in NMIs too, with this fix.

-- Steve

