Re: [PATCH] KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall

From: Maxim Levitsky
Date: Thu Dec 09 2021 - 05:44:36 EST


On Thu, 2021-12-09 at 11:29 +0100, Vitaly Kuznetsov wrote:
> Prior to commit 0baedd792713 ("KVM: x86: make Hyper-V PV TLB flush use
> tlb_flush_guest()"), kvm_hv_flush_tlb() was using 'KVM_REQ_TLB_FLUSH |
> KVM_REQUEST_NO_WAKEUP' when making a request to flush TLBs on other vCPUs
> and KVM_REQ_TLB_FLUSH is/was defined as:
>
> (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
>
> so KVM_REQUEST_WAIT was lost. Hyper-V TLFS, however, requires that
> "This call guarantees that by the time control returns back to the
> caller, the observable effects of all flushes on the specified virtual
> processors have occurred." and without KVM_REQUEST_WAIT there's a small
> chance that the vCPU making the TLB flush will resume running before
> all IPIs get delivered to other vCPUs and a stale mapping can get read
> there.
>
> Fix the issue by adding KVM_REQUEST_WAIT flag to KVM_REQ_TLB_FLUSH_GUEST:
> kvm_hv_flush_tlb() is the sole caller which uses it for
> kvm_make_all_cpus_request()/kvm_make_vcpus_request_mask() where
> KVM_REQUEST_WAIT makes a difference.
>
> Cc: stable@xxxxxxxxxx
> Fixes: 0baedd792713 ("KVM: x86: make Hyper-V PV TLB flush use tlb_flush_guest()")
> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> ---
> - Note, the issue was found by code inspection. Sporadic crashes of
> big Windows guests using Hyper-V TLB flush enlightenment were reported
> but I have no proof that these crashes are anyhow related.
> ---
> arch/x86/include/asm/kvm_host.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e41ad1ead721..8afb21c8a64f 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -97,7 +97,7 @@
> KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> #define KVM_REQ_TLB_FLUSH_CURRENT KVM_ARCH_REQ(26)
> #define KVM_REQ_TLB_FLUSH_GUEST \
> - KVM_ARCH_REQ_FLAGS(27, KVM_REQUEST_NO_WAKEUP)
> + KVM_ARCH_REQ_FLAGS(27, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
> #define KVM_REQ_APF_READY KVM_ARCH_REQ(28)
> #define KVM_REQ_MSR_FILTER_CHANGED KVM_ARCH_REQ(29)
> #define KVM_REQ_UPDATE_CPU_DIRTY_LOGGING \

Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

I wonder if this will fix the rare, random Windows crashes I have seen
when running a Hyper-V-enabled VM nested. In a nested scenario, such
races are much more likely to happen.
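
For anyone following along, here is a minimal userspace sketch of what the
one-line change does to the request encoding. It is not kernel code: the
bit positions are my assumption of what include/linux/kvm_host.h uses, and
the REQ_TLB_FLUSH_GUEST_OLD/_NEW names and the two helpers are made up for
illustration. As far as I understand it, the WAIT bit is what the
make-request path passes on as the "wait" argument when it sends the IPIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: low 8 bits hold the request number, flags sit above. */
#define KVM_REQUEST_MASK       0xffu
#define KVM_REQUEST_NO_WAKEUP  (1u << 8)
#define KVM_REQUEST_WAIT       (1u << 9)
#define KVM_REQUEST_ARCH_BASE  8u

#define KVM_ARCH_REQ_FLAGS(nr, flags) \
	((unsigned int)(((nr) + KVM_REQUEST_ARCH_BASE) | (flags)))

/* Hypothetical names: the definition before and after the patch. */
#define REQ_TLB_FLUSH_GUEST_OLD \
	KVM_ARCH_REQ_FLAGS(27u, KVM_REQUEST_NO_WAKEUP)
#define REQ_TLB_FLUSH_GUEST_NEW \
	KVM_ARCH_REQ_FLAGS(27u, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)

/* Hypothetical helper: does the requester block until IPIs are acked? */
static inline bool req_waits_for_ack(unsigned int req)
{
	return (req & KVM_REQUEST_WAIT) != 0;
}

/* Hypothetical helper: request number with the flag bits stripped. */
static inline unsigned int req_number(unsigned int req)
{
	return req & KVM_REQUEST_MASK;
}

/* The patch changes only the flags, never the request number itself. */
static_assert((REQ_TLB_FLUSH_GUEST_OLD & KVM_REQUEST_MASK) ==
	      (REQ_TLB_FLUSH_GUEST_NEW & KVM_REQUEST_MASK),
	      "request number must be unchanged by the patch");
```

With the old definition req_waits_for_ack() is false, so the vCPU issuing
the hypercall could resume before the other vCPUs have seen the request.
With the new one it is true.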

Best regards,
Maxim Levitsky