Re: [PATCH 1/2] KVM: x86: fix usage of kvm_lock in set_nx_huge_pages()

From: Sean Christopherson
Date: Fri Jan 24 2025 - 15:11:35 EST


On Fri, Jan 24, 2025, Paolo Bonzini wrote:
> Protect the whole function with kvm_lock so that all accesses to
> nx_hugepage_mitigation_hard_disabled are under the lock; but drop it
> when calling out to the MMU to avoid complex circular locking
> situations such as the following:

...

> To break the deadlock, release kvm_lock while taking kvm->slots_lock, which
> breaks the chain:

Heh, except it's all kinds of broken. IMO, biting the bullet and converting to
an SRCU-protected list is going to be far less work in the long run.
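For reference, the win with SRCU is that readers can sleep, so the walk can
take kvm->slots_lock without dropping anything. A minimal sketch of the shape
of the conversion (vm_srcu is a hypothetical new SRCU domain, and a real
conversion would still need to handle VMs that are mid-destruction, e.g. via
kvm_get_kvm_safe()):

	DEFINE_SRCU(vm_srcu);

	/* Writer side: kvm_destroy_vm() unpublishes, then waits for readers. */
	static void vm_list_remove(struct kvm *kvm)
	{
		mutex_lock(&kvm_lock);
		list_del_rcu(&kvm->vm_list);
		mutex_unlock(&kvm_lock);

		synchronize_srcu(&vm_srcu);
	}

	/* Reader side: may block while walking the list. */
	static void zap_all_vms(void)
	{
		struct kvm *kvm;
		int idx = srcu_read_lock(&vm_srcu);

		list_for_each_entry_srcu(kvm, &vm_list, vm_list,
					 srcu_read_lock_held(&vm_srcu)) {
			mutex_lock(&kvm->slots_lock);
			kvm_mmu_zap_all_fast(kvm);
			mutex_unlock(&kvm->slots_lock);
		}
		srcu_read_unlock(&vm_srcu, idx);
	}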

> @@ -7143,16 +7141,19 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
> if (new_val != old_val) {
> struct kvm *kvm;
>
> - mutex_lock(&kvm_lock);
> -
> list_for_each_entry(kvm, &vm_list, vm_list) {

This is unsafe, as vm_list can be modified while kvm_lock is dropped. And
using list_for_each_entry_safe() doesn't help, because the _next_ entry could
already have been freed.
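To illustrate (a sketch of the failure mode, not code from the patch):

	list_for_each_entry_safe(kvm, tmp, &vm_list, vm_list) {
		mutex_unlock(&kvm_lock);

		/*
		 * Nothing prevents another task from destroying and freeing
		 * "tmp" (the cached next entry) here.
		 */

		mutex_lock(&kvm_lock);
		/* The iterator now advances through freed memory. */
	}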

> + kvm_get_kvm(kvm);

This needs to be:

	if (!kvm_get_kvm_safe(kvm))
		continue;

because the last reference to the VM could already have been put.

> + mutex_unlock(&kvm_lock);
> +
> mutex_lock(&kvm->slots_lock);
> kvm_mmu_zap_all_fast(kvm);
> mutex_unlock(&kvm->slots_lock);
>
> vhost_task_wake(kvm->arch.nx_huge_page_recovery_thread);

See my bug report on this being a NULL pointer deref.
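The wake needs to tolerate the thread not existing, e.g. (a sketch only, the
actual fix may look different):

	struct vhost_task *task =
		READ_ONCE(kvm->arch.nx_huge_page_recovery_thread);

	if (task)
		vhost_task_wake(task);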

> +
> + mutex_lock(&kvm_lock);
> + kvm_put_kvm(kvm);

The order is backwards: kvm_put_kvm() needs to be called before acquiring
kvm_lock. If the last reference is put while kvm_lock is held, kvm_put_kvm()
=> kvm_destroy_vm() will deadlock on kvm_lock, as kvm_destroy_vm() takes
kvm_lock to remove the VM from vm_list.
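I.e. the put has to happen outside the lock:

	kvm_put_kvm(kvm);	/* may invoke kvm_destroy_vm() => mutex_lock(&kvm_lock) */
	mutex_lock(&kvm_lock);

Though of course once the reference is dropped, "kvm" can't safely be used to
continue the list walk, which circles back to the SRCU suggestion above.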

> }
> - mutex_unlock(&kvm_lock);
> }