Re: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to vfio

From: Alex Williamson
Date: Thu May 13 2021 - 14:22:43 EST


On Thu, 13 May 2021 17:52:56 +0000
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx> wrote:

> Hi Alex,
>
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > Sent: 13 May 2021 18:04
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> > Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>; liulongfang
> > <liulongfang@xxxxxxxxxx>; cohuck@xxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; linuxarm@xxxxxxxxxxxxx
> > Subject: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to
> > vfio
> >
> > On Thu, 13 May 2021 15:49:25 +0000
> > Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> > wrote:
> >
> > > > -----Original Message-----
> > > > From: Jason Gunthorpe [mailto:jgg@xxxxxxxxxx]
> > > > Sent: 13 May 2021 14:44
> > > > To: liulongfang <liulongfang@xxxxxxxxxx>
> > > > Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>; cohuck@xxxxxxxxxx;
> > > > linux-kernel@xxxxxxxxxxxxxxx; linuxarm@xxxxxxxxxxxxx
> > > > Subject: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to
> > > > vfio
> > > >
> > > > On Thu, May 13, 2021 at 10:08:28AM +0800, liulongfang wrote:
> > > > > On 2021/5/12 20:10, Jason Gunthorpe wrote:
> > > > > > On Wed, May 12, 2021 at 04:39:43PM +0800, liulongfang wrote:
> > > > > >
> > > > > >> Therefore, this method of limiting the length of the BAR
> > > > > >> configuration space can prevent unsafe operations of the memory.
> > > > > >
> > > > > > The issue is DMA controlled by the guest accessing the secure BAR
> > > > > > area, not the guest CPU.
> > > > > >
> > > > > > Jason
> > > > > > .
> > > > > >
> > > > > This secure BAR area is not presented to the Guest,
> > > > > which makes it impossible for the Guest to obtain the secure BAR area
> > > > > when establishing the DMA mapping of the configuration space.
> > > > > If the DMA controller accesses the secure BAR area, the access will
> > > > > be blocked by the SMMU.
> > > >
> > > > There are scenarios where this is not true.
> > > >
> > > > At a minimum the mdev driver should refuse to work in those cases.
> > > >
> > >
> > > Hi,
> > >
> > > I think the idea here is not a generic solution, but a quirk for this specific dev.
> > >
> > > Something like,
> > >
> > > --- a/drivers/vfio/pci/vfio_pci.c
> > > +++ b/drivers/vfio/pci/vfio_pci.c
> > > @@ -866,7 +866,12 @@ static long vfio_pci_ioctl(struct vfio_device *core_vdev,
> > >                 break;
> > >         case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX:
> > >                 info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index);
> > > -               info.size = pci_resource_len(pdev, info.index);
> > > +
> > > +               if (check_hisi_acc_quirk(pdev, info))
> > > +                       info.size = new_size; // BAR is limited without migration region.
> > > +               else
> > > +                       info.size = pci_resource_len(pdev, info.index);
> > > +
> > >                 if (!info.size) {
> > >                         info.flags = 0;
> > >                         break;
> > >
> > > Is this an acceptable/workable solution here?
> >
> > As Jason says, this only restricts CPU access to the BAR, the issue is
> > DMA access. As the hardware vendor you may be able to guarantee that
> > a DMA transaction generated by the device targeting the remainder of
> > the BAR will always go upstream, but can you guarantee the routing
> > between the device and the SMMU? For instance if this device can be
> > implemented as a plugin card, then it can be installed into a
> > downstream port that may not support ACS. That downstream port may
> > implement request redirection allowing the transaction to reflect back
> > to the device without IOMMU translation. At that point the userspace
> > driver can target the kernel driver half of the BAR and potentially
> > expose a security risk. Thanks,
>
> The ACC devices on this platform are not pluggable devices. They are exposed
> as integrated endpoint devices. So I am not sure the above concern is valid in this
> case.
>
> I had a look at the userspace driver approach you suggested. But unfortunately
> the migration state change for the vf has to check some of the pf registers for
> confirming the state. So even if we move the implementation to Qemu, we
> still may have to use the migration uAPI to access the pf device registers.
>
> Since the devices we are concerned here are all integrated endpoints and if the
> above quirk is an acceptable one, then we can use the uAPI as done in this
> series without overly complicating things here.

If you expect this device to appear only as an integrated endpoint, then
I think Jason's suggestion above is correct. Your driver that supports
migration can refuse to load for devices where the topology is other
than expected and you're effectively guaranteeing DMA isolation of the
user and in-kernel drivers by hardware DMA semantics and topology.

Requiring access to the PF to support the migration protocol also
suggests that an in-kernel driver to support migration is our best
option. Thanks,

Alex