Re: [PATCH] KVM: Fix error path in kvm_vm_ioctl_create_vcpu() on xa_store() failure

From: Michal Luczaj
Date: Wed Jul 31 2024 - 15:28:07 EST


On 7/31/24 18:18, Sean Christopherson wrote:
> On Wed, Jul 31, 2024, Michal Luczaj wrote:
>> On 7/31/24 15:31, Will Deacon wrote:
>>> On Tue, Jul 30, 2024 at 04:31:08PM -0700, Sean Christopherson wrote:
>>>> On Tue, Jul 30, 2024, Michal Luczaj wrote:
>>>>> On 7/30/24 17:56, Will Deacon wrote:
>>>>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>>>> index d0788d0a72cc..b80dd8cead8c 100644
>>>>>> --- a/virt/kvm/kvm_main.c
>>>>>> +++ b/virt/kvm/kvm_main.c
>>>>>> @@ -4293,7 +4293,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id)
>>>>>>
>>>>>>  	if (KVM_BUG_ON(xa_store(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, 0), kvm)) {
>>>>>>  		r = -EINVAL;
>>>>>> -		goto kvm_put_xa_release;
>>>>>> +		goto err_xa_release;
>>>>>>  	}
>>>>>>
>>>>>>  	/*
>>>>>> @@ -4310,6 +4310,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id)
>>>>>>
>>>>>>  kvm_put_xa_release:
>>>>>>  	kvm_put_kvm_no_destroy(kvm);
>>>>>> +err_xa_release:
>>>>>>  	xa_release(&kvm->vcpu_array, vcpu->vcpu_idx);
>>>>>>  unlock_vcpu_destroy:
>>>>>>  	mutex_unlock(&kvm->lock);
>>>>>
>>>>> My bad for neglecting the "impossible" path. Thanks for the fix.
>>>>>
>>>>> I wonder if it's complete. If we really want to consider the possibility of
>>>>> this xa_store() failing, then keeping vCPU fd installed and calling
>>>>> kmem_cache_free(kvm_vcpu_cache, vcpu) on the error path looks wrong.
>>>>
>>>> Yeah, the vCPU is exposed to userspace, freeing its assets will just cause
>>>> different problems. KVM_BUG_ON() will prevent _new_ vCPU ioctl() calls (and kick
>>>> running vCPUs out of the guest), but it doesn't interrupt other CPUs, e.g. if
>>>> userspace is being sneaky and has already invoked a vCPU ioctl(), KVM will hit
>>>> a use-after-free (several of them).
>>>
>>> Damn, yes. Just because we haven't returned the fd yet, doesn't mean
>>> userspace can't make use of it.
>>>
>>>> As Michal alluded to, it should be impossible for xa_store() to fail since KVM
>>>> pre-allocates/reserves memory. Given that, deliberately leaking the vCPU seems
>>>> like the least awful "solution".
>>>
>>> Could we actually just move the xa_store() before the fd creation? I
>>> can't immediately see any issues with that...
>>
>> Hah, please see commit afb2acb2e3a3 :) Long story short: create_vcpu_fd()
>> can legally fail, which must be handled gracefully, which would involve
>> destruction of an already xa_store()ed vCPU, which is racy.
>
> Ya, the basic problem is that we have two ways of publishing the vCPU, fd and
> vcpu_array, with no way of setting both atomically. Given that xa_store() should
> never fail, I vote we do the simple thing and deliberately leak the memory.

I agree it's a good idea. So for a failed xa_store(), just drop the goto?
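
For illustration only, here is a rough userspace model of the "publish twice,
leak on failure" idea discussed above. It is not kernel code and not a proposed
patch: every name in it (model_vm, model_vcpu, model_store, model_create_vcpu,
fail_store) is hypothetical, and the fixed-size arrays merely stand in for the
vCPU fd and kvm->vcpu_array.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_MAX_VCPUS 4

/* Hypothetical userspace model; none of these names are KVM APIs. */
struct model_vcpu {
	int id;
};

struct model_vm {
	struct model_vcpu *fd_table[MODEL_MAX_VCPUS];   /* stands in for the installed vCPU fd */
	struct model_vcpu *vcpu_array[MODEL_MAX_VCPUS]; /* stands in for kvm->vcpu_array */
	bool bugged;     /* stands in for the state KVM_BUG_ON() leaves behind */
	bool fail_store; /* test knob: force the "impossible" store failure */
};

/* Second publication point; fails only when the test knob is set. */
static int model_store(struct model_vm *vm, int idx, struct model_vcpu *vcpu)
{
	if (vm->fail_store)
		return -1;
	vm->vcpu_array[idx] = vcpu;
	return 0;
}

static int model_create_vcpu(struct model_vm *vm, int idx)
{
	struct model_vcpu *vcpu;

	assert(vm != NULL);
	if (idx < 0 || idx >= MODEL_MAX_VCPUS)
		return -EINVAL;

	vcpu = calloc(1, sizeof(*vcpu));
	if (!vcpu)
		return -ENOMEM;
	vcpu->id = idx;

	/* First publication point: "userspace" can reach the vCPU from here on. */
	vm->fd_table[idx] = vcpu;

	if (model_store(vm, idx, vcpu)) {
		vm->bugged = true;
		/*
		 * No goto into the cleanup path: the vCPU is already reachable
		 * through fd_table, so freeing it here would leave a dangling
		 * pointer. It is deliberately leaked instead.
		 */
		return -EINVAL;
	}

	return 0;
}
```

With fail_store set, model_create_vcpu() returns -EINVAL and marks the VM as
bugged, but the object reachable through fd_table stays allocated, so a
concurrent user of that first handle never touches freed memory.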