Re: [PATCH v6 0/9] vITS Migration fixes and reset

From: Christoffer Dall
Date: Tue Oct 31 2017 - 02:44:11 EST


On Mon, Oct 30, 2017 at 08:59:36AM +0100, Auger Eric wrote:
> Hi Christoffer,
>
> On 30/10/2017 07:20, Christoffer Dall wrote:
> > Hi Eric,
> >
> > On Thu, Oct 26, 2017 at 05:23:02PM +0200, Eric Auger wrote:
> >> This series fixes various bugs observed when saving/restoring the
> >> ITS state before the guest writes the ITS registers (on first boot or
> >> after reset/reboot).
> >>
> >> This is a follow-up of Wanghaibin's series [1] plus additional
> >> patches following additional code review. It also proposes one
> >> ITS reset implementation.
> >>
> >> Currently, the in-kernel emulated ITS is not reset. After a
> >> reset/reboot, the ITS register values and caches are left
> >> unchanged. Registers may point to some tables in guest memory
> >> which do not exist anymore. If an ITS state backup is initiated
> >> before the guest re-writes the registers, the save fails
> >> because inconsistencies are detected. Restoring data saved at
> >> such a moment also fails.
> >>
> >> Patches [1-4] fix bugs observed during migration at an
> >> early guest boot stage:
> >> - Handle the case where all collection, device and ITT entries are
> >> invalid on restore (which is not an error)
> >> - Check the GITS_BASER<n> valid bit before attempting to save
> >> any table
> >> - Check the GITS_BASER<n> and GITS_CBASER are valid before enabling
> >> the ITS
> >>
> >> Patches [5-9] allow the caches to be emptied on reset and implement
> >> a new ITS reset IOCTL.
> >
> > I applied patches 1-4 to kvmarm/master and included them in a late pull
> > request to kvm.
> >
> > I also took the remaining patches with the adjusted comment in
> > kvmarm/next.
>
> OK Thanks.
> >
> > One question: Don't we have a remaining issue to support saving the
> > collection table even if the device table is inconsistent and vice
> > versa? Are you planning on picking up that work?
>
> Actually to make it clear, patches 1-4 don't fix all failures but we
> discussed at KVM forum that we shouldn't try to fix all of them without
> a proper ITS reset implementation. So indeed even with patches 1-4 you
> can get the migration failing as the save can happen after a reset,
> in-between the collection creation and the PCIe device MSI registration.
> This happens because the caches are not void and the L1 device table
> entries are not valid. In that case the device table save fails and we
> get no chance to save the collection as we abort the save immediately.
> So on restore, the guest will not work properly.
>
> But on top of kvmarm/next, caches are void and we shouldn't face this
> issue anymore. So in case the device table save fails, I think it still
> makes sense to return an error.
>
> But this means that migration of the ITS at early guest boot stage is
> fixed only on kvmarm/next.
>

I think you're right.

Thanks,
-Christoffer
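
For readers following the thread, the valid-bit checks listed in the quoted
cover letter can be illustrated roughly as below. This is only a sketch, not
the code from the series: the struct, the helper names and the SKETCH_*
macros are hypothetical. The one fact it relies on is that the Valid bit of
GITS_BASER<n> and of GITS_CBASER is bit 63 in the GICv3 architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Valid bit (bit 63) of GITS_BASER<n> and GITS_CBASER in GICv3. */
#define SKETCH_BASER_VALID   (1ULL << 63)
#define SKETCH_CBASER_VALID  (1ULL << 63)

/* Hypothetical, simplified view of the emulated ITS register state. */
struct sketch_its {
	uint64_t baser_device;  /* GITS_BASER<n> describing the device table */
	uint64_t baser_coll;    /* GITS_BASER<n> describing the collection table */
	uint64_t cbaser;        /* GITS_CBASER (command queue base) */
};

/* A table is only worth saving if the guest marked its BASER as valid. */
static bool sketch_table_savable(uint64_t baser)
{
	return (baser & SKETCH_BASER_VALID) != 0;
}

/* Enabling the ITS only makes sense once all base registers are valid. */
static bool sketch_can_enable(const struct sketch_its *its)
{
	return sketch_table_savable(its->baser_device) &&
	       sketch_table_savable(its->baser_coll) &&
	       (its->cbaser & SKETCH_CBASER_VALID) != 0;
}
```

With checks of this kind, a save attempted before the guest has programmed
the registers skips the unprogrammed tables instead of failing on them.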