Re: [PATCH V2 7/8] vfio/pci: Support dynamic MSI-x

From: Alex Williamson
Date: Mon Apr 03 2023 - 16:23:45 EST


On Mon, 3 Apr 2023 10:31:23 -0700
Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:

> Hi Alex,
>
> On 3/31/2023 3:24 PM, Alex Williamson wrote:
> > On Fri, 31 Mar 2023 10:49:16 -0700
> > Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:
> >> On 3/30/2023 3:42 PM, Alex Williamson wrote:
> >>> On Thu, 30 Mar 2023 16:40:50 -0600
> >>> Alex Williamson <alex.williamson@xxxxxxxxxx> wrote:
> >>>
> >>>> On Tue, 28 Mar 2023 14:53:34 -0700
> >>>> Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:
> >>>>
>
> ...
>
> >>>>> + msix_map.index = vector;
> >>>>> + msix_map.virq = irq;
> >>>>> + pci_msix_free_irq(pdev, msix_map);
> >>>>> + }
> >>>>> + vfio_pci_memory_unlock_and_restore(vdev, cmd);
> >>>>> out_put_eventfd_ctx:
> >>>>> eventfd_ctx_put(trigger);
> >>>>> out_free_name:
> >>>>> kfree(ctx->name);
> >>>>> ctx->name = NULL;
> >>>>> +out_free_ctx:
> >>>>> + if (allow_dyn_alloc && new_ctx)
> >>>>> + vfio_irq_ctx_free(vdev, ctx, vector);
> >>>>> return ret;
> >>>>> }
> >>>>>
> >>>>
> >>>> Do we really need the new_ctx test in the above cases? Thanks,
> >>
> >> new_ctx is not required for correctness but instead is used to keep
> >> the code symmetric.
> >> Specifically, if the user enables MSI-X without providing triggers and
> >> then later assign triggers then an error path without new_ctx would unwind
> >> more than done in this function, it would free the context that
> >> was allocated within vfio_msi_enable().
> >
> > Seems like we already have that asymmetry, if a trigger is unset we'll
> > free the ctx allocated by vfio_msi_enable(). Tracking which are
>
> Apologies, but could you please elaborate on where the asymmetry is? I am
> not able to see a flow in this solution where the ctx allocated by
> vfio_msi_enable() is freed if the trigger is unset.

The user first calls SET_IRQS to enable MSI-X with some number of
vectors with (potentially) an eventfd for each vector. The user later
calls SET_IRQS passing a -1 eventfd for one or more of the vectors with
an eventfd initialized in the prior step. Given that we find the ctx,
the ctx has a trigger, and assuming dynamic allocation is supported, the
ctx is freed and vfio_msi_set_vector_signal() returns w/o allocating a
new ctx. We've de-allocated both the irq and context initialized from
vfio_msi_enable().

> > allocated where is unnecessarily complex, how about a policy that
>
> I do not see this as tracking where allocations are made. Instead I
> see it as containing/compartmentalizing state changes with the goal of
> making the code easier to understand and maintain. Specifically, new_ctx
> is used so that if vfio_msi_set_vector_signal() fails, the state
> before and after vfio_msi_set_vector_signal() will be the same.

That's not really possible given how we teardown the existing ctx
before configuring the new one and unwind to disable contexts in
vfio_msi_set_block()

> I do agree that it makes vfio_msi_set_vector_signal() more complex
> and I can remove new_ctx if you find that this is unnecessary after
> considering the motivations behind its use.

If the goal is to allow the user to swap one eventfd for another, where
the result will always be the new eventfd on success or the old eventfd
on error, I don't see that this code does that, or that we've ever
attempted to make such a guarantee. If the ioctl errors, I think the
eventfds are generally deconfigured. We certainly have the unwind code
that we discussed earlier that deconfigures all the vectors previously
touched in the loop (which seems to be another path where we could
de-allocate from the set of initial ctxs).

> > devices supporting vdev->has_dyn_msix only ever have active contexts
> > allocated? Thanks,
>
> What do you see as an "active context"? A policy that is currently enforced
> is that an allocated context always has an allocated interrupt associated
> with it. I do not see how this could be expanded to also require an
> enabled interrupt because interrupt enabling requires a trigger that
> may not be available.

A context is essentially meant to track a trigger, ie. an eventfd
provided by the user. In the static case all the irqs are necessarily
pre-allocated, therefore we had no reason to consider a dynamic array
for the contexts. However, a given context is really only "active" if
it has a trigger, otherwise it's just a placeholder. When the
placeholder is filled by an eventfd, the pre-allocated irq is enabled.

This proposal seems to be a hybrid approach, pre-allocating some
initial set of irqs and contexts and expecting the differentiation to
occur only when new vectors are added, though we have some disagreement
about this per above. Unfortunately I don't see an API to enable MSI-X
without some vectors, so some pre-allocation of irqs seems to be
required regardless.

But if non-active contexts were only placeholders in the pre-dynamic
world and we now manage them via a dynamic array, why is there any
pre-allocation of contexts without knowing the nature of the eventfd to
fill it? We could have more commonality between cases if contexts are
always dynamically allocated, which might simplify differentiation of
the has_dyn_msix cases largely to wrappers allocating and freeing irqs.
Thanks,

Alex