Re: [PATCH v2] x86-64/Xen: fix stack switching

From: Jan Beulich
Date: Thu Nov 22 2018 - 03:07:11 EST


>>> On 21.11.18 at 16:24, <luto@xxxxxxxxxx> wrote:
> On Wed, Nov 21, 2018 at 2:10 AM Jan Beulich <JBeulich@xxxxxxxx> wrote:
>> --- 4.20-rc3/arch/x86/entry/entry_64.S
>> +++ 4.20-rc3-x86_64-stack-switch-Xen/arch/x86/entry/entry_64.S
>> @@ -1380,6 +1380,12 @@ ENTRY(nmi)
>> swapgs
>> cld
>> SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
>> +
>> + movq PER_CPU_VAR(cpu_current_top_of_stack), %rdx
>> + subq $8, %rdx
>> + xorq %rsp, %rdx
>> + shrq $PAGE_SHIFT, %rdx
>> + jz .Lnmi_keep_stack
>
> This code shouldn't even be reachable on Xen:
>
> commit 43e4111086a70c78bedb6ad990bee97f17b27a6e
> Author: Juergen Gross <jgross@xxxxxxxx>
> Date: Thu Nov 2 00:59:07 2017 -0700
>
> xen, x86/entry/64: Add xen NMI trap entry
>
> Instead of trying to execute any NMI via the bare metal's NMI trap
> handler use a Xen specific one for PV domains, like we do for e.g.
> debug traps. As in a PV domain the NMI is handled via the normal
> kernel stack this is the correct thing to do.
>
> This will enable us to get rid of the very fragile and questionable
> dependencies between the bare metal NMI handler and Xen assumptions
> believed to be broken anyway.

Oh, I hadn't noticed this. The origins of this patch pre-date that
commit, though, and I then missed its addition. Thanks for
pointing this out.

>> --- 4.20-rc3/arch/x86/entry/entry_64_compat.S
>> +++ 4.20-rc3-x86_64-stack-switch-Xen/arch/x86/entry/entry_64_compat.S
>> @@ -361,17 +361,23 @@ ENTRY(entry_INT80_compat)
>>
>> /* Need to switch before accessing the thread stack. */
>> SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
>> +
>> + movq PER_CPU_VAR(cpu_current_top_of_stack), %rdi
>> + subq $8, %rdi
>> + xorq %rsp, %rdi
>> + shrq $PAGE_SHIFT, %rdi
>> + jz .Lint80_keep_stack
>
> This comparison is IMO the wrong test entirely. How about something like:
>
> /* On Xen PV, entry_INT80_compat is called on the thread stack, so
> rewinding to the top of the thread stack would allow an NMI to
> overwrite the hardware frame before we copy it. */
> ALTERNATIVE "", "jmp .Lint80_keep_stack", X86_FEATURE_XENPV

Indeed, I had noted this as an alternative option in v1, but
didn't get any feedback on it. If that's the preferred route, I'll
certainly switch.
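
For reference, the comparison in the quoted hunks can be read as the
following C sketch. It is only an illustration of what the
movq/subq/xorq/shrq/jz sequence computes, not code from the patch: the
helper name and the SKETCH_PAGE_SHIFT value of 12 (4 KiB pages) are
assumptions made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper mirroring the quoted sequence:
 *   movq top_of_stack, %reg
 *   subq $8, %reg
 *   xorq %rsp, %reg
 *   shrq $PAGE_SHIFT, %reg
 *   jz   keep_stack
 *
 * Returns true when rsp lies in the same page as the last 8-byte slot
 * below top_of_stack, which is the case where the asm takes the jz.
 */
static bool sketch_on_top_stack_page(uint64_t top_of_stack, uint64_t rsp)
{
	/* top_of_stack points one past the stack, so step back one slot. */
	uint64_t last_slot = top_of_stack - 8;

	/* XOR clears all bits on which the two addresses agree; if nothing
	 * is left above the page-offset bits, both are in the same page. */
	return ((last_slot ^ rsp) >> SKETCH_PAGE_SHIFT) == 0;
}
```

Note that this only asks whether %rsp is in the same page as the top
of the thread stack, whereas the ALTERNATIVE suggested above keys the
decision on X86_FEATURE_XENPV instead.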

Jan