Re: [PATCH v1] KVM: destruct kvm_io_device while unregistering it from kvm_io_bus
From: Sean Christopherson
Date: Mon Jan 23 2023 - 18:25:31 EST
On Thu, Dec 29, 2022, Wei Wang wrote:
> Current usage of kvm_io_device requires users to destruct it with an extra
> call of kvm_iodevice_destructor after the device gets unregistered from
> the kvm_io_bus. This is not necessary and can cause errors if a user
> forgot to make the extra call.
>
> Simplify the usage by combining kvm_iodevice_destructor into
> kvm_io_bus_unregister_dev. This reduces LOCs a bit for users and can
> avoid the leakage of destructing the device explicitly.
>
> The fix was originally provided by Sean Christopherson.
> Link: https://lore.kernel.org/lkml/DS0PR11MB6373F27D0EE6CD28C784478BDCEC9@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/
> Fixes: 5d3c4c79384a ("KVM: Stop looking for coalesced MMIO zones if the bus is destroyed")
> Cc: stable@xxxxxxxxxxxxxxx
> Reported-by: 柳菁峰 <liujingfeng@xxxxxxxxxxx>
> Signed-off-by: Wei Wang <wei.w.wang@xxxxxxxxx>
> ---
> include/kvm/iodev.h | 6 ------
> virt/kvm/coalesced_mmio.c | 1 -
> virt/kvm/eventfd.c | 1 -
> virt/kvm/kvm_main.c | 24 +++++++++++++++---------
> 4 files changed, 15 insertions(+), 17 deletions(-)
>
> diff --git a/include/kvm/iodev.h b/include/kvm/iodev.h
> index d75fc4365746..56619e33251e 100644
> --- a/include/kvm/iodev.h
> +++ b/include/kvm/iodev.h
> @@ -55,10 +55,4 @@ static inline int kvm_iodevice_write(struct kvm_vcpu *vcpu,
> : -EOPNOTSUPP;
> }
>
> -static inline void kvm_iodevice_destructor(struct kvm_io_device *dev)
> -{
> - if (dev->ops->destructor)
> - dev->ops->destructor(dev);
> -}
> -
> #endif /* __KVM_IODEV_H__ */
> diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
> index 0be80c213f7f..d7135a5e76f8 100644
> --- a/virt/kvm/coalesced_mmio.c
> +++ b/virt/kvm/coalesced_mmio.c
> @@ -195,7 +195,6 @@ int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
> */
> if (r)
> break;
> - kvm_iodevice_destructor(&dev->dev);
The comment lurking above this is now stale.
> }
> }
>
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index 2a3ed401ce46..1b277afb545b 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -898,7 +898,6 @@ kvm_deassign_ioeventfd_idx(struct kvm *kvm, enum kvm_bus bus_idx,
> bus = kvm_get_bus(kvm, bus_idx);
> if (bus)
> bus->ioeventfd_count--;
> - ioeventfd_release(p);
> ret = 0;
> break;
> }
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 13e88297f999..582757ebdce6 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -5200,6 +5200,12 @@ static struct notifier_block kvm_reboot_notifier = {
> .priority = 0,
> };
>
> +static void kvm_iodevice_destructor(struct kvm_io_device *dev)
> +{
> + if (dev->ops->destructor)
> + dev->ops->destructor(dev);
> +}
> +
> static void kvm_io_bus_destroy(struct kvm_io_bus *bus)
> {
> int i;
> @@ -5423,7 +5429,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
> int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
> struct kvm_io_device *dev)
> {
> - int i, j;
> + int i;
> struct kvm_io_bus *new_bus, *bus;
>
> lockdep_assert_held(&kvm->slots_lock);
> @@ -5453,18 +5459,18 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
> rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
> synchronize_srcu_expedited(&kvm->srcu);
>
> - /* Destroy the old bus _after_ installing the (null) bus. */
> + /*
> + * If (null) bus is installed, destroy the old bus, including all the
> + * attached devices. Otherwise, destroy the caller's device only.
> + */
> if (!new_bus) {
> pr_err("kvm: failed to shrink bus, removing it completely\n");
> - for (j = 0; j < bus->dev_count; j++) {
> - if (j == i)
> - continue;
> - kvm_iodevice_destructor(bus->range[j].dev);
> - }
> + kvm_io_bus_destroy(bus);
> + return -ENOMEM;
Returning an error code is unnecessary if unregister_dev() destroys the bus.
Nothing ultimately consumes the result, e.g. kvm_vm_ioctl_unregister_coalesced_mmio()
intentionally ignores the result other than to bail from the loop, and destroying
the bus means it will immediately bail from the loop anyways.
> }
>
> - kfree(bus);
> - return new_bus ? 0 : -ENOMEM;
> + kvm_iodevice_destructor(dev);
> + return 0;
Unless I'm misreading things, this path leaks "bus".
Given that that intent is to send the fix for stable, that this is as much a
cleanup as it is a bug fix, and that it's not super trivial, I'm inclined to queue
my patch and then land this on top as cleanup.