Re: [PATCH 03/17] vfio/pci: Consistently acquire mutex for interrupt management

From: Alex Williamson
Date: Mon Feb 05 2024 - 17:36:31 EST


On Thu, 1 Feb 2024 20:56:57 -0800
Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:

> vfio_pci_set_irqs_ioctl() is the entrypoint for interrupt management
> via the VFIO_DEVICE_SET_IRQS ioctl(). The igate mutex is obtained
> before calling vfio_pci_set_irqs_ioctl() for management of all interrupt
> types to protect against concurrent changes to the eventfds associated
> with device request notification and error interrupts.
>
> The igate mutex is not acquired consistently. The mutex is always
> (for all interrupt types) acquired from within vfio_pci_ioctl_set_irqs()
> before calling vfio_pci_set_irqs_ioctl(), but vfio_pci_set_irqs_ioctl() is
> called via vfio_pci_core_disable() without the mutex held. The latter
> is expected to be correct if the code flow guarantees that
> the provided interrupt type is not a device request notification or error
> interrupt.

The latter is correct because the disable path always operates on a
physical interrupt type (INTx/MSI/MSI-X), as dictated by vdev->irq_type,
and the interrupt code prevents the handler from being called after the
interrupt is disabled. It's intentional that we don't acquire igate
here, since we only need to prevent a race with concurrent user access,
which cannot occur in the fd release path. The igate mutex is acquired
consistently where it's required.

It would be more forthcoming to state that potential future emulated
device interrupts don't make the same guarantees. But if that's true,
why can't they?

> Move igate mutex acquire and release into vfio_pci_set_irqs_ioctl()
> to make the locking consistent irrespective of interrupt type.
> This is one step closer to containing the interrupt management locking
> internals within the interrupt management code so that the VFIO PCI
> core can trigger management of the eventfds associated with device
> request notification and error interrupts without needing to access
> and manipulate VFIO interrupt management locks and data.

If all we want to do is move the mutex into vfio_pci_intrs.c, then we
could rename the function to __vfio_pci_set_irqs_ioctl() and create a
wrapper around it that grabs the mutex. The disable path could use the
lockless version, and we wouldn't need to clutter the exit path with
unlocking the mutex as done below. Thanks,

Alex
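
For illustration, the wrapper approach suggested above might look
roughly like the following. This is only a userspace sketch, not kernel
code: struct fake_vdev, the fake_* functions, the FAKE_* constants and
the use of a pthread mutex in place of vdev->igate are all hypothetical
stand-ins, and fake_set_irqs_nolock() plays the role of the proposed
__vfio_pci_set_irqs_ioctl().

```c
#include <errno.h>
#include <pthread.h>

#define FAKE_IRQ_NONE	(-1)	/* no interrupt type configured */
#define FAKE_NUM_IRQS	3	/* stand-ins for INTx, MSI, MSI-X */

/* Hypothetical stand-in for struct vfio_pci_core_device. */
struct fake_vdev {
	pthread_mutex_t igate;	/* plays the role of vdev->igate */
	int irq_type;		/* plays the role of vdev->irq_type */
};

/*
 * Lockless worker, the counterpart of the suggested
 * __vfio_pci_set_irqs_ioctl(). The caller must either hold igate or
 * be on a path that cannot race with user access.
 */
static int fake_set_irqs_nolock(struct fake_vdev *vdev, int index, int enable)
{
	if (index < 0 || index >= FAKE_NUM_IRQS)
		return -ENOTTY;	/* plain return, nothing to unlock */

	vdev->irq_type = enable ? index : FAKE_IRQ_NONE;
	return 0;
}

/* Locked wrapper, the entry point the ioctl path would call. */
static int fake_set_irqs(struct fake_vdev *vdev, int index, int enable)
{
	int ret;

	pthread_mutex_lock(&vdev->igate);
	ret = fake_set_irqs_nolock(vdev, index, enable);
	pthread_mutex_unlock(&vdev->igate);
	return ret;
}

/* Release path: no concurrent user access, so it skips the lock. */
static void fake_disable(struct fake_vdev *vdev)
{
	if (vdev->irq_type != FAKE_IRQ_NONE)
		fake_set_irqs_nolock(vdev, vdev->irq_type, 0);
}
```

With this split, the worker keeps its early returns and the lock and
unlock calls sit next to each other in one small function, so no
out_unlock label is needed.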

> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> ---
> Note to maintainers:
> Originally formed part of the IMS submission below, but is not
> specific to IMS.
> https://lore.kernel.org/lkml/cover.1696609476.git.reinette.chatre@xxxxxxxxx
>
> drivers/vfio/pci/vfio_pci_core.c | 3 ---
> drivers/vfio/pci/vfio_pci_intrs.c | 10 ++++++++--
> 2 files changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 1cbc990d42e0..d2847ca2f0cb 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1214,12 +1214,9 @@ static int vfio_pci_ioctl_set_irqs(struct vfio_pci_core_device *vdev,
>  			return PTR_ERR(data);
>  	}
>
> -	mutex_lock(&vdev->igate);
> -
>  	ret = vfio_pci_set_irqs_ioctl(vdev, hdr.flags, hdr.index, hdr.start,
>  				      hdr.count, data);
>
> -	mutex_unlock(&vdev->igate);
>  	kfree(data);
>
>  	return ret;
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index 69ab11863282..97a3bb22b186 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -793,7 +793,9 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
>  	int (*func)(struct vfio_pci_core_device *vdev, unsigned int index,
>  		    unsigned int start, unsigned int count, uint32_t flags,
>  		    void *data) = NULL;
> +	int ret = -ENOTTY;
>
> +	mutex_lock(&vdev->igate);
>  	switch (index) {
>  	case VFIO_PCI_INTX_IRQ_INDEX:
>  		switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
> @@ -838,7 +840,11 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
>  	}
>
>  	if (!func)
> -		return -ENOTTY;
> +		goto out_unlock;
> +
> +	ret = func(vdev, index, start, count, flags, data);
> +out_unlock:
> +	mutex_unlock(&vdev->igate);
> +	return ret;
>
> -	return func(vdev, index, start, count, flags, data);
>  }