Re: [PATCH v4 06/17] PCI: add SIOV and IMS capability detection
From: Jason Gunthorpe
Date: Wed Nov 04 2020 - 08:54:27 EST
On Wed, Nov 04, 2020 at 01:34:08PM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > Sent: Wednesday, November 4, 2020 8:40 PM
> >
> > On Wed, Nov 04, 2020 at 03:41:33AM +0000, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > > Sent: Tuesday, November 3, 2020 8:44 PM
> > > >
> > > > On Tue, Nov 03, 2020 at 02:49:27AM +0000, Tian, Kevin wrote:
> > > >
> > > > > > There is a missing hypercall to allow the guest to do this on its own,
> > > > > > presumably it will someday be fixed so IMS can work in guests.
> > > > >
> > > > > Hypercall is VMM specific, while IMS cap provides a VMM-agnostic
> > > > > interface so any guest driver (if following the spec) can seamlessly
> > > > > work on all hypervisors.
> > > >
> > > > It is a *VMM* issue, not PCI. Adding a PCI cap to describe a VMM issue
> > > > is architecturally wrong.
> > > >
> > > > IMS *can not work* in any hypervsior without some special
> > > > hypercall. Just block it in the platform code and forget about the PCI
> > > > cap.
> > > >
> > >
> > > It's per-device thing instead of platform thing. If the VMM understands
> > > the IMS format of a specific device and virtualize it to the guest,
> >
> > Please no! Adding device specific emulation is just going down deeper
> > into this bad architecture.
> >
> > Interrupts is a platform issue. Using emulation of MSI to dynamically
>
> Interrupt controller is a platform issue. Interrupt source is about device.
The interrupt controller is responsible to create an addr/data pair
for an interrupt message. It sets the message format and ensures it
routes to the proper CPU interrupt handler. Everything about the
addr/data pair is owned by the platform interrupt controller.
Devices do not create interrupts. They only trigger the addr/data pair
the platform gives them.
> > insert vectors to a VM was a reasonable, but hacky thing. Now it needs
> > proper platform support.
>
> why is MSI emulation a hacky thing? isn't it defined by PCISIG? I guess
> that I must misunderstand your real point here...
It means the interrupt controller in the VM's platform is a fiction,
the addr/data pairs it creates are not real.
A PCI device assigned to a VM is supposed to be fully contained by the
IOMMU, interrupts included, so there is no reason to do MSI emulation
if the VM's interrupt controller is aware of what addr/data pairs it
can use with the device - eg by getting them through a hypercall. This
is much cleaner and supports things like IMS
Trying to do IMS emulation is nutz, the entire point of IMS is the
device can do what it likes, and emulating that is not going to
feasible. For instance go read the discussion I had with Thomas how a
object-centric device would manage interrupts.
Jason