Re: [PATCH v3 6/6] hisi_acc_vfio_pci: Add support for VFIO live migration

From: Jason Gunthorpe
Date: Thu Sep 16 2021 - 09:58:40 EST


On Wed, Sep 15, 2021 at 01:28:47PM +0000, Shameerali Kolothum Thodi wrote:
>
>
> > From: Jason Gunthorpe [mailto:jgg@xxxxxxxxxx]
> > Sent: 15 September 2021 14:08
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> > Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > linux-crypto@xxxxxxxxxxxxxxx; alex.williamson@xxxxxxxxxx;
> > mgurtovoy@xxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; liulongfang
> > <liulongfang@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> > Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>; Wangzhou (B)
> > <wangzhou1@xxxxxxxxxxxxx>
> > Subject: Re: [PATCH v3 6/6] hisi_acc_vfio_pci: Add support for VFIO live
> > migration
> >
> > On Wed, Sep 15, 2021 at 10:50:37AM +0100, Shameer Kolothum wrote:
> > > +/*
> > > + * HiSilicon ACC VF dev MMIO space contains both the functional register
> > > + * space and the migration control register space. We hide the migration
> > > + * control space from the Guest. But to successfully complete the live
> > > + * migration, we still need access to the functional MMIO space assigned
> > > + * to the Guest. To avoid any potential security issues, we need to be
> > > + * careful not to access this region while the Guest vCPUs are running.
> > > + *
> > > + * Hence check the device state before we map the region.
> > > + */
> >
> > The prior patch prevents mapping this area into the guest at all,
> > right?
>
> That’s right. It will prevent the Guest from mapping this area.
>
> > So why the comment and logic? If the MMIO area isn't mapped then there
> > is nothing to do, right?
> >
> > The only risk is P2P transactions from devices in the same IOMMU
> > group, and you might do well to mitigate that by asserting that the
> > device is in a singleton IOMMU group?
>
> This was added as an extra protection. I will add the singleton check instead.
>
> > > +static int hisi_acc_vfio_pci_init(struct vfio_pci_core_device *vdev)
> > > +{
> > > +	struct acc_vf_migration *acc_vf_dev;
> > > +	struct pci_dev *pdev = vdev->pdev;
> > > +	struct pci_dev *pf_dev, *vf_dev;
> > > +	struct hisi_qm *pf_qm;
> > > +	int vf_id, ret;
> > > +
> > > +	pf_dev = pdev->physfn;
> > > +	vf_dev = pdev;
> > > +
> > > +	pf_qm = pci_get_drvdata(pf_dev);
> > > +	if (!pf_qm) {
> > > +		pr_err("HiSi ACC qm driver not loaded\n");
> > > +		return -EINVAL;
> > > +	}
> >
> > Nope, this is locked wrong and has no lifetime management.
>
> Ok. Is holding the device_lock() sufficient here?

You can't hold a hisi_qm pointer without some kind of lifecycle
management of that pointer. device_lock/etc is necessary to call
pci_get_drvdata().
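
To illustrate the pattern, here is a userspace sketch only, not kernel
code and not a proposed fix. struct qm, struct pf_slot and the
pf_bind/pf_unbind/qm_get/qm_put helpers are hypothetical stand-ins, and
a pthread mutex plays the role of device_lock(). The point is that the
lookup and the reference grab happen under the same lock that unbind
takes, so the pointer cannot vanish between the two:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hisi_qm, owned by the PF driver. */
struct qm {
	int refcount;	/* protected by the owning slot's lock */
	int id;
};

/* Hypothetical stand-in for the PF device and its drvdata slot. */
struct pf_slot {
	pthread_mutex_t lock;	/* plays the role of device_lock() */
	struct qm *drvdata;	/* NULL while no driver is bound */
};

/* Driver bind: publish the object; the driver holds the first reference. */
void pf_bind(struct pf_slot *pf, struct qm *qm)
{
	pthread_mutex_lock(&pf->lock);
	qm->refcount = 1;
	pf->drvdata = qm;
	pthread_mutex_unlock(&pf->lock);
}

/*
 * Driver unbind: clear the slot and drop the driver's reference.
 * Returns 1 if that was the last reference, 0 if someone still holds one.
 */
int pf_unbind(struct pf_slot *pf)
{
	struct qm *qm;
	int last = 0;

	pthread_mutex_lock(&pf->lock);
	qm = pf->drvdata;
	pf->drvdata = NULL;
	if (qm)
		last = (--qm->refcount == 0);
	pthread_mutex_unlock(&pf->lock);
	return last;
}

/* Look up drvdata and take a reference before the lock is released. */
struct qm *qm_get(struct pf_slot *pf)
{
	struct qm *qm;

	pthread_mutex_lock(&pf->lock);
	qm = pf->drvdata;
	if (qm)
		qm->refcount++;
	pthread_mutex_unlock(&pf->lock);
	return qm;
}

/* Drop a reference; returns 1 when the caller released the last one. */
int qm_put(struct pf_slot *pf, struct qm *qm)
{
	int last;

	pthread_mutex_lock(&pf->lock);
	last = (--qm->refcount == 0);
	pthread_mutex_unlock(&pf->lock);
	return last;
}
```

A reader that merely loads pf->drvdata without the lock and without
taking a reference is what the quoted hunk does: nothing stops the
unbind path from running in between and freeing the object.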

Jason