Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?

From: Andy Lutomirski
Date: Wed Mar 18 2015 - 16:49:39 EST


On Wed, Mar 18, 2015 at 1:06 PM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
> On 03/18/2015 08:26 PM, Andy Lutomirski wrote:
>> Hi Linus-
>>
>> You seem to enjoy debugging these things. Want to give this a shot?
>> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
>> right after swapgs in syscall entry.
>
> The code is:
>
> ENTRY(system_call)
> SWAPGS_UNSAFE_STACK
> GLOBAL(system_call_after_swapgs)
> movq %rsp,PER_CPU_VAR(rsp_scratch)
> movq PER_CPU_VAR(kernel_stack),%rsp
>
> If PER_CPU_VAR(var) memory access can page fault
> (I was thinking this is ensured to never fault),
> then on these two instructions such page fault
> will be fatal: we will still have userspace %rsp.
>
> I thought we can only get a NMI or debug interrupt here,
> and they are both set up to use IST stacks
> to prevent this scenario (among other reasons).

I don't think that #DB is possible -- we should never have a
watchpoint on percpu memory like that (unless we're using kgdb, in
which case I think that kgdb should be fixed).

On the other hand, we can and do take page faults on percpu memory,
because percpu lives in vmap space and we lazily populate PGD entries
in per-mm PGDs. (That is, when we allocate a kernel PGD entry, we
populate it in init_mm's pgd, but we don't proactively copy it during
context switches.)
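
For illustration only, here is a toy user-space model of that lazy
population. It is a sketch, not kernel code: every name in it (toy_pgd,
toy_init_pgd, toy_alloc_kernel_pgd_entry, toy_vmalloc_fault, toy_access)
is hypothetical, and the 8-entry table is an assumption made to keep it
small. It only shows why the first access through a not-yet-synced
per-mm PGD faults while later accesses do not.

```c
#include <assert.h>

#define TOY_PGD_ENTRIES 8   /* toy size, chosen only for this sketch */

/* Hypothetical top-level page table: a nonzero entry means "present". */
struct toy_pgd {
    unsigned long entry[TOY_PGD_ENTRIES];
};

/* Reference table; plays the role of init_mm's pgd in this model. */
static struct toy_pgd toy_init_pgd;

/* Allocating a kernel PGD entry populates only the reference table. */
static void toy_alloc_kernel_pgd_entry(int idx, unsigned long val)
{
    assert(idx >= 0 && idx < TOY_PGD_ENTRIES);
    toy_init_pgd.entry[idx] = val;
}

/*
 * Lazy fix-up performed on a fault: copy the missing entry from the
 * reference table into the faulting per-mm table.
 * Returns 0 if the entry was synced, -1 if the reference table has no
 * entry either (a genuinely bad access).
 */
static int toy_vmalloc_fault(struct toy_pgd *pgd, int idx)
{
    assert(idx >= 0 && idx < TOY_PGD_ENTRIES);
    if (toy_init_pgd.entry[idx] == 0)
        return -1;
    pgd->entry[idx] = toy_init_pgd.entry[idx];
    return 0;
}

/*
 * Model of one memory access through a per-mm table.
 * Returns the number of faults taken (0 or 1), or -1 for a bad access.
 */
static int toy_access(struct toy_pgd *pgd, int idx)
{
    assert(idx >= 0 && idx < TOY_PGD_ENTRIES);
    if (pgd->entry[idx] != 0)
        return 0;               /* already synced: no fault */
    if (toy_vmalloc_fault(pgd, idx) != 0)
        return -1;              /* nothing to sync from */
    return 1;                   /* one fault, then the access succeeds */
}
```

In this model the fix-up is a plain function call. In the real entry
path the fault handler needs a usable stack, which is why a fault on
the two instructions quoted above, while %rsp still holds the userspace
value, would be fatal.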

But the affected system is a laptop, so there shouldn't be CPU hotplug
or enough memory for this to happen. Confused.

--Andy