Re: [PATCH] KVM: Add separate helper for putting borrowed reference to kvm
From: Sean Christopherson
Date: Tue Nov 26 2019 - 12:14:18 EST
On Tue, Nov 26, 2019 at 01:44:14PM -0300, Leonardo Bras wrote:
> On Mon, 2019-10-21 at 15:58 -0700, Sean Christopherson wrote:
...
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 67ef3f2e19e8..b8534c6b8cf6 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -772,6 +772,18 @@ void kvm_put_kvm(struct kvm *kvm)
> > }
> > EXPORT_SYMBOL_GPL(kvm_put_kvm);
> >
> > +/*
> > + * Used to put a reference that was taken on behalf of an object associated
> > + * with a user-visible file descriptor, e.g. a vcpu or device, if installation
> > + * of the new file descriptor fails and the reference cannot be transferred to
> > + * its final owner. In such cases, the caller is still actively using @kvm and
> > + * will fail miserably if the refcount unexpectedly hits zero.
> > + */
> > +void kvm_put_kvm_no_destroy(struct kvm *kvm)
> > +{
> > +	WARN_ON(refcount_dec_and_test(&kvm->users_count));
> > +}
> > +EXPORT_SYMBOL_GPL(kvm_put_kvm_no_destroy);
> >
> > static int kvm_vm_release(struct inode *inode, struct file *filp)
> > {
> > @@ -2679,7 +2691,7 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
> >  	kvm_get_kvm(kvm);
> >  	r = create_vcpu_fd(vcpu);
> >  	if (r < 0) {
> > -		kvm_put_kvm(kvm);
> > +		kvm_put_kvm_no_destroy(kvm);
> >  		goto unlock_vcpu_destroy;
> >  	}
> >
> > @@ -3117,7 +3129,7 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
> >  	kvm_get_kvm(kvm);
> >  	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR | O_CLOEXEC);
> >  	if (ret < 0) {
> > -		kvm_put_kvm(kvm);
> > +		kvm_put_kvm_no_destroy(kvm);
> >  		mutex_lock(&kvm->lock);
> >  		list_del(&dev->vm_node);
> >  		mutex_unlock(&kvm->lock);
>
> Hello,
>
> I see what you are solving here, but wouldn't this behavior cause the
> refcount to reach negative values?
>
> If so, isn't that a problem? I mean, on some archs (powerpc included)
> refcount_dec_and_test() will decrement and then test whether the value
> equals 0. If we ever reach a negative value, that memory will never be
> released.
>
> For example, refcount_dec_and_test(), on archs other than x86,
> will call atomic_dec_and_test(), which in
> include/linux/atomic-fallback.h does:
>
> return atomic_dec_return(v) == 0;
>
> Changing this behavior would mean changing the whole atomic_*_test
> behavior, or making a copy of the function in order to change this
> '== 0' to '<= 0'.
>
> Does it make sense? Do you need any help on this?

I don't think so. refcount_dec_and_test() will WARN on an underflow when
the kernel is built with CONFIG_REFCOUNT_FULL=y. I see no value in
duplicating those sanity checks in KVM.

This new helper and WARN is to explicitly catch @users_count unexpectedly
hitting zero, which is orthogonal to an underflow (although odds are good
that a bug that triggers the WARN in kvm_put_kvm_no_destroy() will also
lead to an underflow). Leaking the memory is deliberate as the alternative
is a guaranteed use-after-free, i.e. kvm_put_kvm_no_destroy() is intended
to be used when users_count is guaranteed to be valid after it is
decremented.
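
To illustrate the distinction, here is a rough userspace sketch of the two
put flavors. It is not kernel code: struct obj, obj_get(), obj_put(),
obj_put_no_destroy() and simulate_failed_fd_install() are hypothetical
stand-ins, C11 atomics replace refcount_t, and a stderr message replaces
WARN_ON(). The only point it tries to show is that the "no destroy" put
never tears the object down, even in the buggy case where the count hits
zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct kvm; not the kernel type. */
struct obj {
	atomic_int users_count;
	bool destroyed;
};

static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->users_count, 1);
}

/* Decrement, then report whether the new value is exactly zero. */
static bool obj_dec_and_test(struct obj *o)
{
	return atomic_fetch_sub(&o->users_count, 1) == 1;
}

/* Normal put: the last reference destroys the object. */
static void obj_put(struct obj *o)
{
	if (obj_dec_and_test(o))
		o->destroyed = true;
}

/*
 * Put for a borrowed reference: the caller still holds its own reference,
 * so reaching zero here is a bug. Complain and leak rather than destroy
 * an object the caller is still using. Returns true if the warning fired.
 */
static bool obj_put_no_destroy(struct obj *o)
{
	bool hit_zero = obj_dec_and_test(o);

	if (hit_zero)
		fprintf(stderr, "WARN: users_count unexpectedly hit zero\n");
	return hit_zero;
}

/*
 * Mimics the error path in the patch: take a reference for the new fd,
 * pretend fd installation failed, and drop the borrowed reference.
 */
static bool simulate_failed_fd_install(struct obj *o)
{
	/* The caller must already hold its own reference. */
	assert(atomic_load(&o->users_count) > 0);

	obj_get(o);
	return obj_put_no_destroy(o);
}
```

With a starting count of 1, simulate_failed_fd_install() leaves the count
at 1 and stays silent. Calling obj_put_no_destroy() again without a
matching get is the buggy case: the warning fires, but the object is
leaked instead of freed under the caller.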