[PATCH] x86/hyper-v: Zero out the VP assist page to fix CPU offlining

From: Dexuan Cui
Date: Wed Jul 03 2019 - 21:46:29 EST


When a CPU is being offlined, the CPU usually still receives a few
interrupts (e.g. reschedule IPIs), after hv_cpu_die() disables the
HV_X64_MSR_VP_ASSIST_PAGE, so hv_apic_eoi_write() may not write the EOI
MSR, if the apic_assist field's bit0 happens to be 1; as a result, Hyper-V
may not be able to deliver all the interrupts to the CPU, and the CPU may
not be stopped, and the kernel will hang soon.

The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
5.2.1 "GPA Overlay Pages"), so with this fix we're sure the apic_assist
field is still zero, after the VP ASSIST PAGE is disabled.

Fixes: ba696429d290 ("x86/hyper-v: Implement EOI assist")
Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
---
arch/x86/hyperv/hv_init.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 0e033ef11a9f..db51a301f759 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -60,8 +60,14 @@ static int hv_cpu_init(unsigned int cpu)
if (!hv_vp_assist_page)
return 0;

+ /*
+ * The ZERO flag is necessary, because in the case of CPU offlining
+ * the page can still be used by hv_apic_eoi_write() for a while,
+ * after the VP ASSIST PAGE is disabled in hv_cpu_die().
+ */
if (!*hvp)
- *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL, PAGE_KERNEL);
+ *hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO,
+ PAGE_KERNEL);

if (*hvp) {
u64 val;
--
2.19.1