Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?

From: Linus Torvalds
Date: Wed Mar 18 2015 - 22:15:34 EST


On Wed, Mar 18, 2015 at 5:57 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
>> sp = 140735967860552,
>
> 0x7fffa55f1748
>
> Note that the double fault happened with rsp == 0x00007fffa55eafb8,
> which is the saved rsp here - 0x6790. That difference is kind of large
> to make sense if this is a sysret problem. Not that I have a better
> explanation...

Actually, that kind of large difference is what I'd expect if it's a
GP fault on sysret that then cascades into more faults because our
kernel stack pointer is crap.

So it starts with getting a GP fault due to the sysret, but now we're
in la-la-land with really odd core register state, so who's to say
that we don't get a recursive fault? We don't use the kernel stack
pointer for getting thread-info any more like we used to, but we still
have code like this in entry_64.c:

testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)

which seems to know that the thread info is below the kernel stack. So
let's say that the GP fault starts taking recursive GP faults (or
recursive page faults) due to confusion with thread_info accesses or
something. And the stack keeps growing down, because all the faults
just fault themselves. Until finally we hit an unmapped area, and that
stops it - because while we had recursive faulting before, it was our
kernel code that was confused. But now the fault handling ends up
taking a page fault while setting up the error information.
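
As a rough illustration of that cascade, here is a toy model in Python.
It is only a sketch: the function name, the frame size and the choice of
lowest mapped address are all hypothetical, and it models nothing but
"each fault pushes another frame until the next push would land in
unmapped memory".

```python
def faults_until_unmapped(rsp, lowest_mapped, frame_bytes):
    """Toy model: count nested faults whose frames still fit in mapped memory.

    rsp           -- stack pointer when the first fault hits
    lowest_mapped -- lowest address that is still mapped (assumed)
    frame_bytes   -- bytes pushed per nested fault (assumed constant)
    Returns (number of frames that fit, final rsp).
    """
    count = 0
    while rsp - frame_bytes >= lowest_mapped:
        rsp -= frame_bytes      # the stack grows down with every fault
        count += 1
    return count, rsp


if __name__ == "__main__":
    # rsp values from this thread; the 192-byte frame is a made-up number,
    # and treating the double-fault rsp as the mapping limit is an assumption.
    n, final_rsp = faults_until_unmapped(0x7fffa55f1748, 0x7fffa55eafb8, 192)
    print(n, hex(final_rsp))
```

The point is just that the distance between the two rsp values depends
on how much mapped stack happened to be below the original rsp, not on
anything about the first fault itself.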

You would *not* expect the stack to be unmapped just under the
original %rsp value. User space has big frames and probably had deep
call chains before it ever hit the problematic case, so there's some
"slop" on the user stack. Only when we run out of slop do we get the
double-fault. Which explains why you should *not* expect the %rsp
values to be similar.

And around 26kB (0x6790 bytes) of stack before that happens sounds
quite reasonable.
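
The numbers quoted above can be checked with a few lines of Python.
This is just a sanity check of the arithmetic; the helper name
stack_gap is made up for illustration, and the two values are the
decimal "sp" from the dump and the rsp reported at the double fault.

```python
def stack_gap(saved_rsp, fault_rsp):
    """Bytes of stack consumed between the saved rsp and the faulting rsp."""
    return saved_rsp - fault_rsp


saved_rsp = 140735967860552        # "sp" from the dump, in decimal
fault_rsp = 0x00007fffa55eafb8     # rsp at the time of the double fault

if __name__ == "__main__":
    gap = stack_gap(saved_rsp, fault_rsp)
    print(hex(saved_rsp))                      # 0x7fffa55f1748
    print(hex(gap), gap, round(gap / 1024, 1)) # 0x6790 26512 25.9
```

So the gap is 0x6790 bytes, a bit under 26 KiB.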

Now, to be honest, I don't see why we'd get the cascading faults, I
just get this feeling that if %rsp is crap, just about anything might
go wrong, and that if it's sysret taking a #GP fault, we're just
screwed.

Linus