Re: [PATCH] KVM: x86/MMU: Zap all when removing memslot if VM has assigned device

From: Paolo Bonzini
Date: Fri Aug 16 2019 - 03:16:48 EST


On 15/08/19 17:12, Sean Christopherson wrote:
> Alex Williamson reported regressions with device assignment when KVM
> changed its memslot removal logic to zap only the SPTEs for the memslot
> being removed. The source of the bug is unknown at this time, and root
> causing the issue will likely be a slow process. In the short term, fix
> the regression by zapping all SPTEs when removing a memslot from a VM
> with assigned device(s).
>
> Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
> Reported-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---
>
> An alternative idea to a full revert. I assume this would be easy to
> backport, and also easy to revert or quirk depending on where the bug
> is hiding.

We're not sure that it only happens with assigned devices; it's just
that assigned BARs are the memslots that are more likely to be
reprogrammed at boot. So this patch feels unsafe.
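
To illustrate the concern, here is a toy user-space model of the two
policies. It is not kernel code: every toy_* name is made up for this
sketch, and "zapping" is modelled as clearing a flag in a small array.
It only shows which VMs end up on which path, under the assumption
that the partial zap is the path that exposes the (still unknown) bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; all toy_* names are hypothetical, not KVM APIs. */
#define TOY_NR_SPTES 8

struct toy_spte {
	bool present;	/* mapping currently installed */
	int slot_id;	/* memslot the mapping was created for */
};

struct toy_vm {
	bool has_assigned_device;
	struct toy_spte sptes[TOY_NR_SPTES];
};

/* Install a mapping that belongs to slot_id at index idx. */
void toy_map(struct toy_vm *vm, size_t idx, int slot_id)
{
	assert(idx < TOY_NR_SPTES);
	vm->sptes[idx].present = true;
	vm->sptes[idx].slot_id = slot_id;
}

/* Count the mappings that are still installed. */
size_t toy_count_present(const struct toy_vm *vm)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_NR_SPTES; i++)
		n += vm->sptes[i].present ? 1 : 0;
	return n;
}

/* Drop every mapping in the VM; returns how many were dropped. */
size_t toy_zap_all(struct toy_vm *vm)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_NR_SPTES; i++) {
		if (vm->sptes[i].present) {
			vm->sptes[i].present = false;
			n++;
		}
	}
	return n;
}

/* Drop only the mappings that belong to slot_id. */
size_t toy_zap_slot(struct toy_vm *vm, int slot_id)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_NR_SPTES; i++) {
		if (vm->sptes[i].present && vm->sptes[i].slot_id == slot_id) {
			vm->sptes[i].present = false;
			n++;
		}
	}
	return n;
}

/* Policy in the quoted patch: full zap only with an assigned device. */
size_t toy_remove_slot_gated(struct toy_vm *vm, int slot_id)
{
	if (vm->has_assigned_device)
		return toy_zap_all(vm);
	return toy_zap_slot(vm, slot_id);
}

/* Unconditional alternative: full zap for every VM. */
size_t toy_remove_slot_always(struct toy_vm *vm, int slot_id)
{
	(void)slot_id;	/* the slot does not matter when everything goes */
	return toy_zap_all(vm);
}
```

In this model a VM with has_assigned_device == false still goes
through toy_zap_slot() under the gated policy, so it would remain
exposed if the bug is not specific to device assignment.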

Paolo

>
> arch/x86/kvm/mmu.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 8f72526e2f68..358b93882ac6 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5659,6 +5659,17 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
>  	bool flush;
>  	gfn_t gfn;
>
> +	/*
> +	 * Zapping only the removed memslot introduced regressions for VMs with
> +	 * assigned devices. It is unknown what piece of code is buggy. Until
> +	 * the source of the bug is identified, zap everything if the VM has an
> +	 * assigned device.
> +	 */
> +	if (kvm_arch_has_assigned_device(kvm)) {
> +		kvm_mmu_zap_all(kvm);
> +		return;
> +	}
> +
>  	spin_lock(&kvm->mmu_lock);
>
>  	if (list_empty(&kvm->arch.active_mmu_pages))
>