Re: [RFC PATCH v2 00/32] Add live update state preservation
From: David Matlack
Date: Mon Feb 02 2026 - 17:49:33 EST
On 2026-01-28 05:11 PM, Samiullah Khawaja wrote:
> On Wed, Jan 28, 2026 at 11:59 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > On Tue, Dec 02, 2025 at 11:02:30PM +0000, Samiullah Khawaja wrote:
> > Start off by just making the successor kernel fail to accept any
> > drivers at all because the iommu_domain was preserved. ie restore the
> > domain and set it as the default_domain and then fail in
> > iommu_device_use_default_domain() and related functions.
>
> Right. In this model, the device would remain unusable after
> liveupdate as you cannot do session finish. So the only way to recover
> would be to do a proper reboot. But I think this is a great
> intermediate step.
This will break the new selftest vfio_pci_liveupdate_kexec_test that is
added in the vfio cdev series:
https://lore.kernel.org/kvm/20260129212510.967611-19-dmatlack@xxxxxxxxxx/
https://lore.kernel.org/kvm/20260129212510.967611-23-dmatlack@xxxxxxxxxx/
And having to reboot a machine back to life after every Live Update will
be painful for everyone working on VFIO, PCI, and IOMMU Live Update
support.
I agree with wanting to do incremental steps, but can we aim for the
initial iommufd series to support an e2e scenario that doesn't leave the
host (or its devices) in an unusable state?