Re: [PATCH v2 02/22] PCI: Add API to track PCI devices preserved across Live Update
From: Alex Williamson
Date: Fri Feb 27 2026 - 13:25:30 EST
On Fri, 27 Feb 2026 09:19:28 -0800
David Matlack <dmatlack@xxxxxxxxxx> wrote:
> On Fri, Feb 27, 2026 at 8:32 AM Alex Williamson <alex@xxxxxxxxxxx> wrote:
> >
> > On Thu, 26 Feb 2026 00:28:28 +0000
> > David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > > > > +static int pci_flb_preserve(struct liveupdate_flb_op_args *args)
> > > > > +{
> > > > > + struct pci_dev *dev = NULL;
> > > > > + int max_nr_devices = 0;
> > > > > + struct pci_ser *ser;
> > > > > + unsigned long size;
> > > > > +
> > > > > + for_each_pci_dev(dev)
> > > > > + max_nr_devices++;
> > > >
> > > > How is this protected against hotplug?
> > >
> > > Pranjal raised this as well. Here was my reply:
> > >
> > > . Yes, it's possible to run out of space to preserve devices if
> > > . devices are hot-plugged and then preserved. But I think it's better
> > > . to defer handling that use-case until it exists (unless you see an
> > > . obvious, simple solution). So far I am not seeing preserving
> > > . hot-plugged devices across Live Update as a high-priority use-case
> > > . to support.
> > >
> > > I am going to add a comment here in the next revision to clarify that.
> > > I will also add a comment clarifying why this code doesn't bother to
> > > account for VFs created after this call (VFs are explicitly disallowed
> > > from being preserved in this patch since they require additional
> > > support).
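> > > For illustration only, one hypothetical way to close that hotplug
> > > window (not something this patch does) would be to hold the PCI
> > > rescan/remove lock across both the count and the later allocation,
> > > e.g.:
> > >
> > >   pci_lock_rescan_remove();
> > >   for_each_pci_dev(dev)
> > >           max_nr_devices++;
> > >   /* ...allocate and serialize while still holding the lock... */
> > >   pci_unlock_rescan_remove();
> > >
> > > That would serialize against hot-add/remove, at the cost of holding a
> > > global mutex for the duration of the preserve operation, which may be
> > > why it's better deferred until the use-case materializes.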
> >
> > TBH, without SR-IOV support and some examples of in-kernel PF
> > preservation in support of vfio-pci VFs, it seems like this only
> > supports a very niche use case.
>
> The intent is to start by supporting a simple use-case and expand to
> more complex scenarios over time, including preserving VFs. Full GPU
> passthrough is common at cloud providers so even non-VF preservation
> support is valuable.
>
> > I expect the majority of vfio-pci
> > devices are VFs and I don't think we want to present a solution where
> > the requirement is to move the PF driver to userspace.
>
> JasonG recommended the upstream support for VF preservation be limited
> to cases where the PF is also bound to VFIO:
>
> https://lore.kernel.org/lkml/20251003120358.GL3195829@xxxxxxxx/
>
> Within Google we have a way to support in-kernel PF drivers but we are
> trying to focus on simpler use-cases first upstream.
>
> > It's not clear,
> > for example, how we can have vfio-pci variant drivers relying on
> > in-kernel channels to PF drivers to support migration in this model.
>
> Agree this still needs to be fleshed out and designed. I think the
> roadmap will be something like:
>
> 1. Get non-VF preservation working end-to-end (device fully preserved
> and doing DMA continuously during Live Update).
> 2. Extend to support VF preservation where the PF is also bound to vfio-pci.
> 3. (Maybe) Extend to support in-kernel PF drivers.
>
> This series is the first step of #1. I have line of sight to how #2
> could work since it's all VFIO.
Without 3, does this become a mainstream feature?

There's obviously a knee-jerk reaction that moving PF drivers into
userspace is a means to circumvent the GPL, which was evident at LPC,
even if the real reason is "in-kernel is hard".

Related to that, there's also not much difference between a userspace
driver and an out-of-tree driver when it comes to adding in-kernel code
for their specific support requirements. Therefore, unless migration is
entirely accomplished via a shared dmabuf between the PF and VF,
orchestrated through userspace, I'm not sure how we get to migration,
which makes KHO vs. migration a binary choice. I have trouble seeing
how that's a viable intermediate step. Thanks,

Alex