Re: Xen PV seems to be broken on Linus' tree

From: Andy Lutomirski
Date: Wed Nov 22 2017 - 10:23:49 EST


On Wed, Nov 22, 2017 at 4:50 AM, Juergen Gross <jgross@xxxxxxxx> wrote:
> On 22/11/17 05:46, Andy Lutomirski wrote:
>> On Tue, Nov 21, 2017 at 8:11 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>> On Tue, Nov 21, 2017 at 7:33 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>>> I'm doing:
>>>>
>>>> /usr/bin/qemu-system-x86_64 -machine accel=kvm:tcg -cpu host -net none \
>>>>     -nographic -kernel xen-4.8.2 -initrd './arch/x86/boot/bzImage' -m 2G \
>>>>     -smp 2 -append console=com1
>>>>
>>>> With Linus' commit c8a0739b185d11d6e2ca7ad9f5835841d1cfc765 and the
>>>> attached config.
>>>>
>>>> It dies with a bunch of sensible log lines and then:
>>>>
>>>> (XEN) d0v0 Unhandled invalid opcode fault/trap [#6, ec=0000]
>>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d08023961a entry.o#create_bounce_frame+0x137/0x146
>>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>>>> (XEN) ----[ Xen-4.8.2 x86_64 debug=n Not tainted ]----
>>>> (XEN) CPU: 0
>>>> (XEN) RIP: e033:[<ffffffff811226eb>]
>>>> (XEN) RFLAGS: 0000000000000296 EM: 1 CONTEXT: pv guest (d0v0)
>>>> (XEN) rax: 000000000000002f rbx: ffffffff81e65a48 rcx: ffffffff81e71288
>>>> (XEN) rdx: ffffffff81e27500 rsi: 0000000000000001 rdi: ffffffff81133f88
>>>> (XEN) rbp: 0000000000000000 rsp: ffffffff81e03e78 r8: 0000000000000000
>>>> (XEN) r9: 0000000000000001 r10: 0000000000000000 r11: 0000000000000000
>>>> (XEN) r12: 0000000000000000 r13: 0000000000000001 r14: 0000000000000001
>>>> (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000003506e0
>>>> (XEN) cr3: 000000007b0b3000 cr2: 0000000000000000
>>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
>>>> (XEN) Guest stack trace from rsp=ffffffff81e03e78:
>>>> (XEN) ffffffff81e71288 0000000000000000 ffffffff811226eb 000000010000e030
>>>> (XEN) 0000000000010096 ffffffff81e03eb8 000000000000e02b ffffffff811226eb
>>>> (XEN) ffffffff81122c2e 0000000000000200 0000000000000000 0000000000000000
>>>> (XEN) 0000000000000030 ffffffff81c69cf5 ffffffff81080b20 ffffffff81080560
>>>> (XEN) 0000000000000000 ffffffff810d3741 ffffffff8107b420 ffffffff81094660
>>>>
>>>> Is this familiar?
>>>>
>>>> I'll feel really dumb if it ends up being my fault.
>>>
>>> Nah, it's broken at least back to v4.13, and I suspect it's config
>>> related. objdump gives me this:
>>>
>>> ffffffff8112b0e1: e9 e8 fe ff ff jmpq ffffffff8112afce <check_flags.part.42+0x4e>
>>> ffffffff8112b0e6: 48 c7 c6 2d f8 c8 81 mov $0xffffffff81c8f82d,%rsi
>>> ffffffff8112b0ed: 48 c7 c7 58 b9 c8 81 mov $0xffffffff81c8b958,%rdi
>>> ffffffff8112b0f4: e8 13 2d 01 00 callq ffffffff8113de0c <printk>
>>> ffffffff8112b0f9: 0f ff (bad) <-- crash here
>>>
>>> That's "ud0", which is used by WARN. So we're probably hitting an
>>> early warning and Xen probably has something busted with early
>>> exception handling.
>>>
>>> Anyone want to debug it and fix it?
>>
>> Well, I think I debugged it. x86_64 has a shiny function
>> idt_setup_early_handler(), and Xen doesn't call it. Fixing the
>> problem may be as simple as calling it at an appropriate time and
>> doing whatever asm magic is needed to deal with Xen's weird IDT
>> calling convention.
>
> Hmm, yes, this should work. I'll have a try.
>
> BTW: I don't think this ever worked.
>

The ud0 trick itself is fairly recent, so old enough kernels (4.10? I
don't really remember) wouldn't die just because of an early warning.
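
For anyone who wants to check their own build for the same pattern, here
is a rough sketch (not something from this thread's tooling) that scans
objdump -d style text for instructions whose encoding starts with 0f ff,
i.e. ud0. The function name find_ud0 and the SAMPLE text are made up for
illustration; SAMPLE just reuses the lines quoted above. It assumes
objdump's usual "address: bytes mnemonic" layout and only looks at the
first two bytes of each line, so treat it as a heuristic.

```python
import re

# One line of `objdump -d` output: address, raw instruction bytes, then text.
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+((?:[0-9a-f]{2}\s+)+)(.*)$")

# 0f ff is the ud0 encoding; older objdump versions print it as "(bad)".
UD0_OPCODE = ("0f", "ff")

# Hypothetical sample input, copied from the disassembly quoted in this mail.
SAMPLE = """\
ffffffff8112b0e6: 48 c7 c6 2d f8 c8 81 mov $0xffffffff81c8f82d,%rsi
ffffffff8112b0ed: 48 c7 c7 58 b9 c8 81 mov $0xffffffff81c8b958,%rdi
ffffffff8112b0f4: e8 13 2d 01 00 callq ffffffff8113de0c <printk>
ffffffff8112b0f9: 0f ff (bad)
"""


def find_ud0(disassembly):
    """Return the addresses of lines whose bytes begin with 0f ff."""
    hits = []
    for line in disassembly.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # not an instruction line (symbol header, blank, ...)
        addr, raw_bytes, _text = match.groups()
        if tuple(raw_bytes.split()[:2]) == UD0_OPCODE:
            hits.append(int(addr, 16))
    return hits


if __name__ == "__main__":
    for addr in find_ud0(SAMPLE):
        print(f"ud0 at {addr:#x}")
```

Feeding it the output of objdump -d vmlinux (as text) should list every
WARN site's trapping instruction on kernels that use ud0 for WARN; it
says nothing about which of them is reached before the early IDT is set
up.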