Re: [PATCH 07/17] vfio/pci: Preserve per-interrupt contexts

From: Alex Williamson
Date: Mon Feb 05 2024 - 17:37:40 EST


On Thu, 1 Feb 2024 20:57:01 -0800
Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:

> MSI and MSI-X interrupt management for PCI passthrough devices create
> a new per-interrupt context every time an interrupt is allocated,
> freeing it when the interrupt is freed.
>
> The per-interrupt context contains the properties of a particular
> interrupt. Without a property that persists across interrupt allocation
> and free it is acceptable to always create a new per-interrupt context.
>
> INTx interrupt context has a "masked" property that persists across
> allocation and free and thus preserves its interrupt context
> across interrupt allocation and free calls.
>
> MSI and MSI-X interrupts already remain allocated across interrupt
> allocation and free requests, additionally maintaining the
> individual interrupt context is a reflection of this existing
> behavior and matches INTx behavior so that more code can be shared.
>
> An additional benefit is that maintaining interrupt context supports
> a potential future use case of emulated interrupts, where the
> "is this interrupt emulated" is a property that needs to persist
> across allocation and free requests.
>
> Persistent interrupt contexts means that existence of per-interrupt
> context no longer implies a valid trigger, pointers to freed memory
> should be cleared, and a new per-interrupt context cannot be assumed
> needing allocation when an interrupt is allocated.
>
> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> ---
> Note to maintainers:
> This addition originally formed part of the IMS work below that mostly
> ignored INTx. This work focuses on INTx, MSI, MSI-X where this addition
> is relevant.
> https://lore.kernel.org/lkml/cover.1696609476.git.reinette.chatre@xxxxxxxxx
>
> drivers/vfio/pci/vfio_pci_intrs.c | 26 ++++++++++++++------------
> 1 file changed, 14 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index 31f73c70fcd2..7ca2b983b66e 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -427,7 +427,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
>
> ctx = vfio_irq_ctx_get(vdev, vector);
>
> - if (ctx) {
> + if (ctx && ctx->trigger) {
> irq_bypass_unregister_producer(&ctx->producer);
> irq = pci_irq_vector(pdev, vector);
> cmd = vfio_pci_memory_lock_and_enable(vdev);
> @@ -435,8 +435,9 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
> vfio_pci_memory_unlock_and_restore(vdev, cmd);
> /* Interrupt stays allocated, will be freed at MSI-X disable. */
> kfree(ctx->name);
> + ctx->name = NULL;

Setting ctx->name = NULL is not strictly necessary and does not match
the INTx code that we're claiming to try to emulate. ctx->name is only
tested immediately after allocation below, otherwise it can be inferred
from ctx->trigger. Thanks,

Alex

> eventfd_ctx_put(ctx->trigger);
> - vfio_irq_ctx_free(vdev, ctx, vector);
> + ctx->trigger = NULL;
> }
>
> if (fd < 0)
> @@ -449,16 +450,17 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
> return irq;
> }
>
> - ctx = vfio_irq_ctx_alloc(vdev, vector);
> - if (!ctx)
> - return -ENOMEM;
> + /* Per-interrupt context remain allocated. */
> + if (!ctx) {
> + ctx = vfio_irq_ctx_alloc(vdev, vector);
> + if (!ctx)
> + return -ENOMEM;
> + }
>
> ctx->name = kasprintf(GFP_KERNEL_ACCOUNT, "vfio-msi%s[%d](%s)",
> msix ? "x" : "", vector, pci_name(pdev));
> - if (!ctx->name) {
> - ret = -ENOMEM;
> - goto out_free_ctx;
> - }
> + if (!ctx->name)
> + return -ENOMEM;
>
> trigger = eventfd_ctx_fdget(fd);
> if (IS_ERR(trigger)) {
> @@ -502,8 +504,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_core_device *vdev,
> eventfd_ctx_put(trigger);
> out_free_name:
> kfree(ctx->name);
> -out_free_ctx:
> - vfio_irq_ctx_free(vdev, ctx, vector);
> + ctx->name = NULL;
> return ret;
> }
>
> @@ -539,6 +540,7 @@ static void vfio_msi_disable(struct vfio_pci_core_device *vdev,
> vfio_virqfd_disable(&ctx->unmask);
> vfio_virqfd_disable(&ctx->mask);
> vfio_msi_set_vector_signal(vdev, i, -1, index);
> + vfio_irq_ctx_free(vdev, ctx, i);
> }
>
> cmd = vfio_pci_memory_lock_and_enable(vdev);
> @@ -694,7 +696,7 @@ static int vfio_pci_set_msi_trigger(struct vfio_pci_core_device *vdev,
>
> for (i = start; i < start + count; i++) {
> ctx = vfio_irq_ctx_get(vdev, i);
> - if (!ctx)
> + if (!ctx || !ctx->trigger)
> continue;
> if (flags & VFIO_IRQ_SET_DATA_NONE) {
> eventfd_signal(ctx->trigger);