RE: [PATCH] x86/hyper-v: Zero out the VP assist page to fix CPU offlining
From: Dexuan Cui
Date: Thu Jul 18 2019 - 03:52:43 EST
> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Sent: Thursday, July 18, 2019 12:01 AM
> ...
> Those are two different things. The GPF_ZERO allocation makes sense on its
> own but it _cannot_ prevent the following scenario:
Hi tglx,
The scenario can be prevented.
The VP ASSIST PAGE is an "overlay" page (please see Hyper-V TLFS's Section
5.2.1 "GPA Overlay Pages", on page 38 of the spec).
The spec can be downloaded from
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
(choose the v5.0c release)
Here is an excerpt of the section:
"
The hypervisor defines several special pages that "overlay" the guest's GPA
space. The hypercall code page is an example of an overlay page. Overlays
are addressed by Guest Physical Addresses (GPA) but are not included in the
normal GPA map maintained internally by the hypervisor. Conceptually,
they exist in a separate map that overlays the GPA map.
If a page within the GPA space is overlaid, any SPA page mapped to the
GPA page is effectively "obscured" and generally unreachable by the
virtual processor through processor memory accesses.
...
If an overlay page is disabled or is moved to a new location in the GPA
space, the underlying GPA page is "uncovered", and an existing
mapping becomes accessible to the guest.
"
Here, SPA = System Physical Address = the final real physical address.
> cpu_init()
> if (!hvp)
> hvp = vmalloc(...., GFP_ZERO);
> ...
>
> hvp->apic_assist |= 1;
When the VP ASSIST PAGE feature is enabled and the hypervisor sets
the bit in hvp->apic_assist, the bit belongs to the special SPA page.
> #1 cpu_die()
> if (....)
> wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
After the VP ASSIST PAGE is disabled, hvp->apic_assist belongs to
the "normal" SPA page mapped to the GPA.
> ---> IPI
> if (!(hvp->apic_assist & 1))
> wrmsr(APIC_EOI); <- PATH not taken
So, with the one-line patch or the three-line patch, here we're sure
vp->apic_assist.bit0 must be 0.
> #3 cpu is dead
>
> cpu_init()
> if (!hvp)
> hvp = vmalloc(....m, GFP_ZERO); <- NOT TAKEN because hvp !=
> NULL
It does not matter, because with the 1-line patch, the initial content of
the "normal" SPA page is filled with zeros; later, neither the hypervisor nor
the guest writes into the page, so the page always remains with zeros.
> So you have to come up with a better fairy tale why GFP_ZERO 'fixes' this.
>
> Allocating hvp with GFP_ZERO makes sense on it's own so the allocated
> memory has a defined state, but that's a different story.
>
> The 3 liner patch above makes way more sense and you can spare the
> local_irq_save/restore by moving the whole condition into the
> irq_save/restore region above.
>
> tglx
The concept of the "overlay page" seems weird, and frankly speaking,
I don't really understand why the Hyper-V guys invented it, but as far
as this patch here is concerned, I think the patch is safe and it can
indeed fix the CPU offlining issue I described.
Thanks,
-- Dexuan