RE: [PATCH v3 0/6] vfio/hisilicon: add acc live migration driver

From: Shameerali Kolothum Thodi
Date: Thu Sep 30 2021 - 02:34:43 EST

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> Sent: 30 September 2021 01:42
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>;
> kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> linux-crypto@xxxxxxxxxxxxxxx
> Cc: alex.williamson@xxxxxxxxxx; jgg@xxxxxxxxxx; mgurtovoy@xxxxxxxxxx;
> Linuxarm <linuxarm@xxxxxxxxxx>; liulongfang <liulongfang@xxxxxxxxxx>;
> Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>; Jonathan Cameron
> <jonathan.cameron@xxxxxxxxxx>; Wangzhou (B)
> <wangzhou1@xxxxxxxxxxxxx>; He, Shaopeng <shaopeng.he@xxxxxxxxx>; Zhao,
> Yan Y <yan.y.zhao@xxxxxxxxx>
> Subject: RE: [PATCH v3 0/6] vfio/hisilicon: add acc live migration driver
>
> > From: Shameerali Kolothum Thodi
> > <shameerali.kolothum.thodi@xxxxxxxxxx>
> >
> > > From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> > > Sent: 29 September 2021 10:06
> > >
> > > > From: Shameerali Kolothum Thodi
> > > > <shameerali.kolothum.thodi@xxxxxxxxxx>
> > > >
> > > > Hi Kevin,
> > > >
> > > > > From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> > > > > Sent: 29 September 2021 04:58
> > > > >
> > > > > Hi, Shameer,
> > > > >
> > > > > > From: Shameer Kolothum <shameerali.kolothum.thodi@xxxxxxxxxx>
> > > > > > Sent: Wednesday, September 15, 2021 5:51 PM
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Thanks to the introduction of vfio_pci_core subsystem framework[0],
> > > > > > now it is possible to provide vendor specific functionality to
> > > > > > vfio pci devices. This series attempts to add vfio live migration
> > > > > > support for HiSilicon ACC VF devices based on the new framework.
> > > > > >
> > > > > > HiSilicon ACC VF device MMIO space includes both the functional
> > > > > > register space and migration control register space. As discussed
> > > > > > in RFCv1[1], this may create security issues as these regions get
> > > > > > shared between the Guest driver and the migration driver.
> > > > > > Based on the feedback, we tried to address those concerns in
> > > > > > this version.
> > > > >
> > > > > This series doesn't mention anything related to dirty page tracking.
> > > > > > Are you relying on Keqian's series for utilizing hardware iommu dirty
> > > > > bit (e.g. SMMU HTTU)?
> > > >
> > > > Yes, this doesn't have dirty page tracking and the plan is to make use of
> > > > Keqian's SMMU HTTU work to improve performance. We have done basic
> > > > sanity testing with those patches.
> > > >
> > >
> > > Do you plan to support migration w/o HTTU as the fallback option?
> > > Generally one would expect the basic functionality ready before talking
> > > about optimization.
> >
> > Yes, the plan is to get the basic live migration working and then we can
> > optimize it with SMMU HTTU when it is available.
>
> The interesting thing is that w/o HTTU vfio will just report every pinned
> page as dirty, i.e. the entire guest memory is dirty. This completely kills
> the benefit of the precopy phase since QEMU still needs to transfer the entire
> guest memory in the stop-copy phase. This is not a 'working' model for
> live migration.
>
> So it needs to be clear whether HTTU is really an optimization or
> a hard functional requirement for migrating such a device. If the latter,
> the migration region info is not a nice-to-have thing.

Yes, agreed that we have to transfer the entire Guest memory in this case.
But I don't think that is a killer here, as we would still like to have
basic live migration enabled on these platforms so that it can be used
where the memory transfer constraints are acceptable.
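
To put rough numbers on that trade-off, here is a minimal sketch (Python,
purely illustrative, not taken from this series). It assumes the whole guest
is pinned, that without IOMMU dirty tracking every pinned page is reported
dirty, and that with tracking only the pages written during the last precopy
round remain. The function names, the 4 KiB page size and the link bandwidth
are all hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def stop_copy_pages(pinned_pages, written_last_round, has_dirty_tracking):
    """Pages still to be sent once the guest is stopped (simplified model)."""
    if has_dirty_tracking:
        # Only pages actually written since the last precopy pass remain.
        return min(written_last_round, pinned_pages)
    # No tracking: every pinned page is reported dirty, so precopy
    # does not reduce the stop-copy set at all.
    return pinned_pages


def stop_copy_seconds(pages, bandwidth_bytes_per_s):
    """Rough stop-copy transfer time, ignoring device state and protocol overhead."""
    return pages * PAGE_SIZE / bandwidth_bytes_per_s


GUEST_PAGES = (4 << 30) // PAGE_SIZE  # hypothetical 4 GiB guest, fully pinned
LINK_BYTES_PER_S = 1 << 30            # hypothetical 1 GiB/s migration link

without_tracking = stop_copy_pages(GUEST_PAGES, 2048, has_dirty_tracking=False)
with_tracking = stop_copy_pages(GUEST_PAGES, 2048, has_dirty_tracking=True)
```

In this model the stop-copy transfer time grows linearly with guest size when
there is no tracking, so whether it is acceptable depends on guest size and
link speed.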

> btw the fallback option that I raised earlier is more like some software
> mitigation for collecting dirty pages, e.g. analyzing the ring descriptors
> to build software-tracked dirty info by mediating the cmd portal
> (which requires dynamically unmapping cmd portal from the fast-path
> to enable mediation). We are looking into this option for some platform
> which lacks IOMMU dirty bit support.

Interesting. Is there anything available publicly so that we can take a look?
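
As a reference point for that discussion, a minimal sketch of what
descriptor-based dirty tracking could look like is below. It is only a guess
at the idea, not Kevin's actual design or anything in this series: the
descriptor layout (DMA address, length, device-writes flag) and all names are
hypothetical, and it leaves out the cmd portal unmapping and trapping
entirely.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages


def dirty_pages_from_descriptors(descriptors):
    """Collect page frame numbers the device may write to.

    Each descriptor is a hypothetical (dma_addr, length, device_writes)
    tuple, as a mediating driver might decode it from a submitted ring entry.
    """
    dirty = set()
    for dma_addr, length, device_writes in descriptors:
        if not device_writes or length == 0:
            continue  # buffers the device only reads cannot become dirty
        first = dma_addr >> PAGE_SHIFT
        last = (dma_addr + length - 1) >> PAGE_SHIFT
        dirty.update(range(first, last + 1))
    return dirty


example_ring = [
    (0x1000, 0x2000, True),   # two full pages written by the device
    (0x5FF0, 0x20, True),     # small write that straddles a page boundary
    (0x9000, 0x1000, False),  # source buffer, read-only for the device
]
example_dirty = dirty_pages_from_descriptors(example_ring)
```

The resulting set would then be merged into the dirty bitmap reported to
userspace, in place of marking every pinned page dirty.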

Thanks,
Shameer