Re: frequent lockups in 3.18rc4

From: Steven Rostedt
Date: Thu Nov 20 2014 - 21:36:25 EST

Next message: Wanpeng Li: "[PATCH] sched/fair: fix idle balance when remaining tasks are all non-CFS tasks"
Previous message: Herbert Xu: "Re: crypto: user - Allow get request with empty driver name"
In reply to: Frederic Weisbecker: "Re: frequent lockups in 3.18rc4"
Next in thread: Thomas Gleixner: "Re: frequent lockups in 3.18rc4"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, Nov 20, 2014 at 06:39:20PM -0500, Tejun Heo wrote:
> On Thu, Nov 20, 2014 at 03:08:03PM -0800, Andy Lutomirski wrote:
> > > So, for now, all we need is adding nmi check in percpu accessors,
> > > right?
> > >
> >
> > What's the issue with nmi? Page faults are supposed to nest correctly
> > inside nmi, right?
>
> Thought they couldn't. Looking at the trace that Frederic linked, it
> looks like straight-out tracing function recursion due to an
> unexpected fault while holding a lock. I don't think this can be
> annotated from percpu accessor side. There's nothing special about
> the context. :(

There used to be issues with page faults in NMI. One was that the iretq
from the page fault handler would re-enable NMIs, and if another NMI triggered
then it would stomp all over the stack of the initial NMI. But my triple
copy of the NMI stack frame solved that. You can read all about it here:

http://lwn.net/Articles/484932/

The second bug was that if an NMI triggered right after a page fault, and
it had a page fault, the content of the cr2 register (faulting address)
would be lost for the page fault that was preempted by the NMI.
This too was solved by (cue irony) using per_cpu variables.

Now I'm hoping that kernel boot time per_cpu variables never take any
faults, otherwise we are all f*cked!

>
> Does this matter for anybody other than tracers? Ultimately, the
> solution would be removing the vmalloc area faulting as Thomas
> suggested.

I don't know, but per_cpu variables are rather special and used all
over the place. Most other vmalloc code isn't as heavily used as per_cpu is.

-- Steve

