Re: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

From: Alex Williamson
Date: Thu May 05 2016 - 11:05:26 EST

On Thu, 5 May 2016 12:15:46 +0000
"Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:

> > From: Yongji Xie [mailto:xyjxie@xxxxxxxxxxxxxxxxxx]
> > Sent: Thursday, May 05, 2016 7:43 PM
> >
> > Hi David and Kevin,
> >
> > On 2016/5/5 17:54, David Laight wrote:
> >
> > > From: Tian, Kevin
> > >> Sent: 05 May 2016 10:37
> > > ...
> > >>> Actually, we do not aim to access the MSI-X table from the
> > >>> guest. So I think it's safe to passthrough MSI-X table if we
> > >>> can make sure guest kernel would not touch MSI-X table in
> > >>> normal code path such as para-virtualized guest kernel on PPC64.
> > >>>
> > >> Then how do you prevent malicious guest kernel accessing it?
> > > Or a malicious guest driver for an ethernet card setting up
> > > the receive buffer ring to contain a single word entry that
> > > contains the address associated with an MSI-X interrupt and
> > > then using a loopback mode to cause a specific packet to be
> > > received that writes the required word through that address.
> > >
> > > Remember the PCIe cycle for an interrupt is a normal memory write
> > > cycle.
> > >
> > > David
> > >
> >
> > If we have enough permission to load a malicious driver or
> > kernel, we can easily break the guest even without an exposed
> > MSI-X table.
> >
> > I think it should be safe to expose the MSI-X table if we can
> > make sure that a malicious guest driver/kernel can't use
> > the MSI-X table to break other guests or the host. The
> > capability of IRQ remapping could provide this
> > kind of protection.
> >
> IRQ remapping alone doesn't mean you can pass the MSI-X structure
> through to the guest. I know the actual IRQ remapping might be platform
> specific, but at least per the Intel VT-d specification, an MSI-X entry
> must be configured by the host kernel in a remappable format which
> contains an index into the IRQ remapping table. The index selects an
> IRQ remapping entry which controls interrupt routing for a specific
> device. If you allow a malicious program to write a random index into
> an MSI-X entry of the assigned device, the hole is obvious...
> The above might make sense only for an IRQ remapping implementation
> which doesn't rely on an extended MSI-X format (e.g. simply based on
> BDF). If that's the case for PPC, then you should build MSI-X
> passthrough based on this fact instead of on whether general IRQ
> remapping is enabled or not.
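
For reference, a rough sketch of the remappable format Kevin describes,
assuming the VT-d MSI address layout (0xFEE in bits 31:20, index[14:0] in
bits 19:5, format in bit 4, subhandle-valid in bit 3, index[15] in bit 2).
The macro and helper names are made up for illustration and are not taken
from the kernel. It only shows that the index is ordinary bits in the
address word of the entry:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants; names are hypothetical, layout assumed from VT-d. */
#define MSI_ADDR_BASE          0xFEE00000u /* bits 31:20 = 0xFEE */
#define MSI_ADDR_FORMAT_REMAP  (1u << 4)   /* bit 4: 1 = remappable format */
#define MSI_ADDR_SHV           (1u << 3)   /* bit 3: subhandle valid */

/* Build a remappable-format MSI address carrying a 16-bit remapping index. */
static uint32_t remap_msi_addr(uint16_t irte_index, bool shv)
{
    uint32_t addr = MSI_ADDR_BASE | MSI_ADDR_FORMAT_REMAP;

    addr |= (uint32_t)(irte_index & 0x7FFFu) << 5;       /* index[14:0] -> bits 19:5 */
    addr |= (uint32_t)((irte_index >> 15) & 1u) << 2;    /* index[15]   -> bit 2 */
    if (shv)
        addr |= MSI_ADDR_SHV;
    return addr;
}

/* Recover the remapping index from a remappable-format MSI address. */
static uint16_t remap_msi_index(uint32_t addr)
{
    return (uint16_t)(((addr >> 5) & 0x7FFFu) | (((addr >> 2) & 1u) << 15));
}

/* True if the address uses the remappable format rather than the compat one. */
static bool msi_addr_is_remappable(uint32_t addr)
{
    return (addr & MSI_ADDR_FORMAT_REMAP) != 0;
}
```

Whoever writes the address field of an entry therefore picks the index
that the remapping hardware will look up.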

I don't think anyone is expecting that we can expose the MSI-X vector
table to the guest and the guest can make direct use of it. The end
goal here is that the guest on a power system is already
paravirtualized to not program the device MSI-X by directly writing to
the MSI-X vector table. They have hypercalls for this since they
always run virtualized. Therefore a) they never intend to touch the
MSI-X vector table and b) they have sufficient isolation that a guest
can only hurt itself by doing so.
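
As a minimal sketch, conditions a) and b) can be treated as two
independent inputs to two separate decisions. The struct and function
names here are hypothetical and do not correspond to anything in vfio or
QEMU:

```c
#include <stdbool.h>

/* Hypothetical description of a platform, for illustration only. */
struct msix_platform {
    bool guest_uses_hypercalls; /* a) guest never writes the vector table */
    bool irq_isolation;         /* b) a bad write can only hurt the writer */
};

/* Letting the user mmap the vector table only needs b). */
static bool may_mmap_msix_table(const struct msix_platform *p)
{
    return p->irq_isolation;
}

/* The VMM must keep intercepting table accesses whenever a) does not hold. */
static bool vmm_must_trap_msix_table(const struct msix_platform *p)
{
    return !p->guest_uses_hypercalls;
}
```

With both a) and b) set, the table can be mapped and nothing needs to be
trapped; with only b), the mapping is harmless but trapping is still
required for MSI-X setup to work.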

On x86 we don't have a); our method of programming the MSI-X vector
table is to write to it directly. Therefore we will always require QEMU
to place a MemoryRegion over the vector table to intercept those
accesses. However with interrupt remapping, we do have b) on x86, which
means that we don't need to be so strict in disallowing user accesses
to the MSI-X vector table. It's not useful for configuring MSI-X on
the device, but the user should only be able to hurt themselves by
writing it directly. x86 doesn't really get anything out of this
change, but it helps this special case on power pretty significantly
aiui. Thanks,