Re: [PATCH V2 7/8] vfio/pci: Support dynamic MSI-x

From: Alex Williamson
Date: Tue Apr 04 2023 - 14:44:28 EST


On Tue, 4 Apr 2023 10:29:14 -0700
Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:

> Hi Kevin,
>
> On 4/3/2023 8:51 PM, Tian, Kevin wrote:
> >> From: Alex Williamson <alex.williamson@xxxxxxxxxx>
> >> Sent: Tuesday, April 4, 2023 11:19 AM
> >>>
> >>> Thank you very much for your guidance. I will digest this some more and
> >>> see how wrappers could be used. In the mean time while trying to think
> >> how
> >>> to unify this code I do think there is an issue in this patch in that
> >>> the get_cached_msi_msg()/pci_write_msi_msg()
> >>> should not be in an else branch.
> >>>
> >>> Specifically, I think it needs to be:
> >>> if (msix) {
> >>> if (irq == -EINVAL) {
> >>> /* dynamically allocate interrupt */
> >>> }
> >>> get_cached_msi_msg(irq, &msg);
> >>> pci_write_msi_msg(irq, &msg);
> >>> }
> >>
> >> Yes, that's looked wrong to me all along, I think that resolves it.
> >> Thanks,
> >>
> >
> > Do you mind elaborating why this change is required? I thought
> > pci_msix_alloc_irq_at() will compose a new msi message to write
> > hence no need to get cached value again in that case...
>
> With this change an interrupt allocated via pci_msix_alloc_irq_at()
> is treated the same as an interrupt allocated via pci_alloc_irq_vectors().
>
> get_cached_msi_msg()/pci_write_msi_msg() is currently called for
> every allocated interrupt and this snippet intends to maintain
> this behavior.
>
> One flow I considered that made me think this is fixing a bug is
> as follows:
> Scenario A (current behavior):
> - host/user enables vectors 0, 1, 2 ,3 ,4
> - kernel allocates all interrupts via pci_alloc_irq_vectors()
> - get_cached_msi_msg()/pci_write_msi_msg() is called for each interrupt

In this scenario, I think the intention is that there's non-zero
time since pci_alloc_irq_vectors() such that a device reset or other
manipulation of the vector table may have occurred, therefore we're
potentially restoring the programming of the vector table with this
get/write.

> Scenario B (this series):
> - host/user enables vector 0
> - kernel allocates interrupt 0 via pci_alloc_irq_vectors()
> - get_cached_msi_msg()/pci_write_msi_msg() is called for interrupt 0
> - host/user enables vector 1
> - kernel allocates interrupt 1 via pci_msix_alloc_irq_at()
> - get_cached_msi_msg()/pci_write_msi_msg() is NOT called for interrupt 1
> /* This seems a bug since host may expect same outcome as in scenario A */
>
> I am not familiar with how the MSI messages are composed though and I surely
> could have gotten this wrong. I would like to learn more after you considered
> the motivation for this change.

I think Kevin has a point, if it's correct that we do this get/write in
order to account for manipulation of the device since we wrote into the
vector table via either pci_alloc_irq_vectors() or
pci_msix_alloc_irq_at(), then it really only makes sense to do that
restore if we haven't allocated the irq and written the vector table
immediately prior. Thanks,

Alex