Re: [patch V2 24/46] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI

From: Derrick, Jonathan
Date: Wed Sep 30 2020 - 09:08:32 EST


+Megha

On Wed, 2020-09-30 at 09:57 -0300, Jason Gunthorpe wrote:
> On Wed, Sep 30, 2020 at 12:45:30PM +0000, Derrick, Jonathan wrote:
> > Hi Jason
> >
> > On Mon, 2020-08-31 at 11:39 -0300, Jason Gunthorpe wrote:
> > > On Wed, Aug 26, 2020 at 01:16:52PM +0200, Thomas Gleixner wrote:
> > > > From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > >
> > > > Devices on the VMD bus use their own MSI irq domain, but it is not
> > > > distinguishable from regular PCI/MSI irq domains. This is required
> > > > to exclude VMD devices from getting the irq domain pointer set by
> > > > interrupt remapping.
> > > >
> > > > Override the default bus token.
> > > >
> > > > Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > > Acked-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > > > drivers/pci/controller/vmd.c | 6 ++++++
> > > > 1 file changed, 6 insertions(+)
> > > >
> > > > +++ b/drivers/pci/controller/vmd.c
> > > > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
> > > > return -ENODEV;
> > > > }
> > > >
> > > > + /*
> > > > + * Override the irq domain bus token so the domain can be distinguished
> > > > + * from a regular PCI/MSI domain.
> > > > + */
> > > > + irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> > > > +
> > >
> > > Having the non-transparent-bridge hold a MSI table and
> > > multiplex/de-multiplex IRQs looks like another good use case for
> > > something close to pci_subdevice_msi_create_irq_domain()?
> > >
> > > If each device could have its own internal MSI-X table programmed
> > > properly things would work alot better. Disable capture/remap of the
> > > MSI range in the NTB.
> > We can disable the capture and remap in newer devices so we don't even
> > need the irq domain.
>
> You'd still need an irq domain, it just comes from
> pci_subdevice_msi_create_irq_domain() instead of internal to this
> driver.
I have this set which disables remapping and bypasses the creation of
the IRQ domain:
https://patchwork.ozlabs.org/project/linux-pci/list/?series=192936

It allows the end-devices like NVMe to directly manager their own
interrupts and eliminates the VMD interrupt completely. The only quirk
was that kernel has to program IRTE with the VMD device ID as it still
remaps Requester-ID from subdevices.

>
> > I would however like to determine if the MSI data bits could be used
> > for individual devices to have the IRQ domain subsystem demultiplex the
> > virq from that and eliminate the IRQ list iteration.
>
> Yes, exactly. This new pci_subdevice_msi_create_irq_domain() creates
> *proper* fully functional interrupts, no need for any IRQ handler in
> this driver.
>
> > A concern I have about that scheme is virtualization as I think those
> > bits are used to route to the virtual vector.
>
> It should be fine with this patch series, consult with Megha how
> virtualization is working with IDXD/etc
>
> Jason