RE: [PATCH v5 1/3] x86/hyper-v: Suspend/resume the hypercall page for hibernation

From: Dexuan Cui
Date: Fri Sep 27 2019 - 02:49:40 EST


> From: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> Sent: Thursday, September 26, 2019 3:44 AM
> > [...]
> > +static int hv_suspend(void)
> > +{
> > + union hv_x64_msr_hypercall_contents hypercall_msr;
> > +
> > + /* Reset the hypercall page */
> > + rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> > + hypercall_msr.enable = 0;
> > + wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> > +
>
> (trying to think out loud, not sure there's a real issue):
>
> When PV IPIs (or PV TLB flush) are enabled we do the following checks:
>
> if (!hv_hypercall_pg)
> return false;
>
> or
> if (!hv_hypercall_pg)
> goto do_native;
>
> which will pass as we're not invalidating the pointer. Can we actually
> be sure that the kernel will never try to send an IPI/do TLB flush
> before we resume?
>
> Vitaly

When hv_suspend() and hv_resume() are called by syscore_suspend()
and syscore_resume(), respectively, all the non-boot CPUs are disabled,
only CPU0 is active, and interrupts are disabled. E.g. see

hibernate() ->
  hibernation_snapshot() ->
    create_image() ->
      suspend_disable_secondary_cpus()
      local_irq_disable()

      syscore_suspend()
      swsusp_arch_suspend()
      syscore_resume()

      local_irq_enable()
      enable_nonboot_cpus()


So, I'm pretty sure no IPI can happen between hv_suspend() and hv_resume().
A self-IPI is not supposed to happen either, since interrupts are disabled.

IMO TLB flush should not be an issue either, unless the kernel changes page
tables between hv_suspend() and hv_resume(). I checked the related code and
that is not the case today, but in theory it might happen in the future.
So if you insist, we can save "hv_hypercall_pg" to a temporary variable and
set "hv_hypercall_pg" to NULL before we disable the hypercall page. I would
be happy to post a new version of this patch with that change, or we can
keep this patch as is and I can make an extra patch.
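
To make the alternative concrete, here is a rough userspace model of the
save/NULL/restore idea. It is only a sketch, not the actual patch: the MSR
is a plain variable, and the names fake_hypercall_msr, fake_page,
hv_hypercall_pg_saved, pv_path_usable(), model_hv_suspend() and
model_hv_resume() are all made up for illustration. Only "hv_hypercall_pg"
and the NULL check come from the code discussed above.

```c
/* Userspace model only: the MSR and the page are plain variables here. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t fake_hypercall_msr = 1;    /* bit 0 models "enable" */
static char fake_page[4096];               /* stands in for the hypercall page */
static void *hv_hypercall_pg = fake_page;  /* checked by the PV IPI/TLB paths */
static void *hv_hypercall_pg_saved;        /* hypothetical temporary */

/* Models the "if (!hv_hypercall_pg) goto do_native;" check. */
bool pv_path_usable(void)
{
	return hv_hypercall_pg != NULL;
}

int model_hv_suspend(void)
{
	/* Hide the page from PV callers before it is disabled. */
	hv_hypercall_pg_saved = hv_hypercall_pg;
	hv_hypercall_pg = NULL;

	fake_hypercall_msr &= ~1ULL;       /* hypercall_msr.enable = 0 */
	return 0;
}

void model_hv_resume(void)
{
	assert(hv_hypercall_pg_saved != NULL);  /* suspend must have run first */

	fake_hypercall_msr |= 1ULL;        /* hypercall_msr.enable = 1 */

	/* Publish the pointer only after the page is enabled again. */
	hv_hypercall_pg = hv_hypercall_pg_saved;
	hv_hypercall_pg_saved = NULL;
}
```

With this ordering, any PV IPI or PV TLB flush attempted between suspend
and resume would see a NULL pointer and fall back to the native path.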

Thanks,
-- Dexuan