Re: [PATCH] KVM: x86/MMU: Zap all when removing memslot if VM has assigned device
From: Alex Williamson
Date: Thu Aug 15 2019 - 15:42:42 EST
On Thu, 15 Aug 2019 08:12:28 -0700
Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
> Alex Williamson reported regressions with device assignment when KVM
> changed its memslot removal logic to zap only the SPTEs for the memslot
> being removed. The source of the bug is unknown at this time, and root
> causing the issue will likely be a slow process. In the short term, fix
> the regression by zapping all SPTEs when removing a memslot from a VM
> with assigned device(s).
>
> Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
> Reported-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---
>
> An alternative idea to a full revert. I assume this would be easy to
> backport, and also easy to revert or quirk depending on where the bug
> is hiding.
>
> arch/x86/kvm/mmu.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 8f72526e2f68..358b93882ac6 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -5659,6 +5659,17 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
>  	bool flush;
>  	gfn_t gfn;
>
> +	/*
> +	 * Zapping only the removed memslot introduced regressions for VMs with
> +	 * assigned devices. It is unknown what piece of code is buggy. Until
> +	 * the source of the bug is identified, zap everything if the VM has an
> +	 * assigned device.
> +	 */
> +	if (kvm_arch_has_assigned_device(kvm)) {
> +		kvm_mmu_zap_all(kvm);
> +		return;
> +	}
> +
>  	spin_lock(&kvm->mmu_lock);
>
>  	if (list_empty(&kvm->arch.active_mmu_pages))
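FWIW, kvm_arch_has_assigned_device() here just reads the counter that the
kvm-vfio device bumps when a group is attached (IIRC), so the zap-all above
only kicks in for a VM that actually has something assigned. The x86 helpers
are roughly the below (quoting from memory, not part of this patch):

/* assigned_device_count is maintained as vfio groups come and go */
void kvm_arch_start_assignment(struct kvm *kvm)
{
	atomic_inc(&kvm->arch.assigned_device_count);
}

void kvm_arch_end_assignment(struct kvm *kvm)
{
	atomic_dec(&kvm->arch.assigned_device_count);
}

bool kvm_arch_has_assigned_device(struct kvm *kvm)
{
	return atomic_read(&kvm->arch.assigned_device_count);
}
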
Though if we want to zoom in a little further, the patch below seems to
work. Both versions perhaps just highlight that we don't really know why
the original code doesn't work with device assignment, whether it's
something special about GPU mappings, or whether it hints at something
more generally wrong and difficult to trigger.

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 24843cf49579..3956b5844479 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5670,7 +5670,8 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
 		gfn = slot->base_gfn + i;
 
 		for_each_valid_sp(kvm, sp, gfn) {
-			if (sp->gfn != gfn)
+			if (sp->gfn != gfn &&
+			    !kvm_arch_has_assigned_device(kvm))
 				continue;
 
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
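
Spelled out, with an assigned device the gfn check is skipped entirely, so
the walk no longer skips hash-bucket entries whose gfn doesn't match and
everything reachable through the slot's buckets gets zapped. Roughly (a
simplified sketch of the resulting loop, not a literal quote of mmu.c):

	for (i = 0; i < slot->npages; i++) {
		gfn = slot->base_gfn + i;

		for_each_valid_sp(kvm, sp, gfn) {
			/*
			 * With an assigned device, don't skip entries that
			 * merely share the hash bucket; zap them all.
			 */
			if (sp->gfn != gfn &&
			    !kvm_arch_has_assigned_device(kvm))
				continue;

			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
		}
	}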