Re: [PATCH v2 02/22] PCI: Add API to track PCI devices preserved across Live Update

From: David Matlack

Date: Fri Feb 27 2026 - 17:38:45 EST


On Fri, Feb 27, 2026 at 2:23 PM Alex Williamson <alex@xxxxxxxxxxx> wrote:
>
> On Fri, 27 Feb 2026 14:19:45 -0800
> David Matlack <dmatlack@xxxxxxxxxx> wrote:
>
> > On Fri, Feb 27, 2026 at 10:25 AM Alex Williamson <alex@xxxxxxxxxxx> wrote:
> > >
> > > On Fri, 27 Feb 2026 09:19:28 -0800
> > > David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > >
> > > > On Fri, Feb 27, 2026 at 8:32 AM Alex Williamson <alex@xxxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, 26 Feb 2026 00:28:28 +0000
> > > > > David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > > > > > > > +static int pci_flb_preserve(struct liveupdate_flb_op_args *args)
> > > > > > > > +{
> > > > > > > > +	struct pci_dev *dev = NULL;
> > > > > > > > +	int max_nr_devices = 0;
> > > > > > > > +	struct pci_ser *ser;
> > > > > > > > +	unsigned long size;
> > > > > > > > +
> > > > > > > > +	for_each_pci_dev(dev)
> > > > > > > > +		max_nr_devices++;
> > > > > > >
> > > > > > > How is this protected against hotplug?
> > > > > >
> > > > > > Pranjal raised this as well. Here was my reply:
> > > > > >
> > > > > > . Yes, it's possible to run out of space to preserve devices if
> > > > > > . devices are hot-plugged and then preserved. But I think it's better
> > > > > > . to defer handling such a use-case until it exists (unless you see an
> > > > > > . obvious, simple solution). So far I am not seeing preserving
> > > > > > . hot-plugged devices across Live Update as a high-priority use-case
> > > > > > . to support.
> > > > > >
> > > > > > I am going to add a comment here in the next revision to clarify that.
> > > > > > I will also add a comment clarifying why this code doesn't bother to
> > > > > > account for VFs created after this call (VFs are explicitly disallowed
> > > > > > from being preserved in this patch since they require additional
> > > > > > support).
> > > > >
> > > > > TBH, without SR-IOV support and some examples of in-kernel PF
> > > > > preservation in support of vfio-pci VFs, it seems like this only
> > > > > supports a very niche use case.
> > > >
> > > > The intent is to start by supporting a simple use-case and expand to
> > > > more complex scenarios over time, including preserving VFs. Full GPU
> > > > passthrough is common at cloud providers, so even non-VF preservation
> > > > support is valuable.
> > > >
> > > > > I expect the majority of vfio-pci
> > > > > devices are VFs and I don't think we want to present a solution where
> > > > > the requirement is to move the PF driver to userspace.
> > > >
> > > > JasonG recommended the upstream support for VF preservation be limited
> > > > to cases where the PF is also bound to VFIO:
> > > >
> > > > https://lore.kernel.org/lkml/20251003120358.GL3195829@xxxxxxxx/
> > > >
> > > > Within Google we have a way to support in-kernel PF drivers but we are
> > > > trying to focus on simpler use-cases first upstream.
> > > >
> > > > > It's not clear,
> > > > > for example, how we can have vfio-pci variant drivers relying on
> > > > > in-kernel channels to PF drivers to support migration in this model.
> > > >
> > > > Agree this still needs to be fleshed out and designed. I think the
> > > > roadmap will be something like:
> > > >
> > > > 1. Get non-VF preservation working end-to-end (device fully preserved
> > > > and doing DMA continuously during Live Update).
> > > > 2. Extend to support VF preservation where the PF is also bound to vfio-pci.
> > > > 3. (Maybe) Extend to support in-kernel PF drivers.
> > > >
> > > > This series is the first step of #1. I have line of sight to how #2
> > > > could work since it's all VFIO.
> > >
> > > Without 3, does this become a mainstream feature?
> >
> > I do think there will be enough demand for (3) that it will be worth
> > doing. But I also think ordering the steps this way makes sense from
> > an iterative development point of view.
> >
> > > There's obviously a knee-jerk reaction that moving PF drivers into
> > > userspace is a means to circumvent the GPL that was evident at LPC,
> > > even if the real reason is "in-kernel is hard".
> > >
> > > Related to that, there's also not much difference between a userspace
> > > driver and an out-of-tree driver when it comes to adding in-kernel code
> > > for their specific support requirements. Therefore, unless migration is
> > > entirely accomplished via a shared dmabuf between PF and VF,
> > > orchestrated through userspace, I'm not sure how we get to migration,
> > > making KHO vs migration a binary choice. I have trouble seeing how
> > > that's a viable intermediate step. Thanks,
> >
> > What do you mean by "migration" in this context?
>
> Live migration support, it's the primary use case currently where we
> have vfio-pci variant drivers on VFs communicating with in-kernel PF
> drivers. Thanks,

I see, so you're saying that if those users wanted Live Update support
and we didn't do (3), they would have to give up their Live Migration
support. That would be additional motivation to do (3).

Jason, does this change your mind about whether (3) is worth doing, or
whether it should be prioritized over (2)?

I think I still lean toward doing (2) before (3) since Live Update is
most useful in setups that cannot support Live Migration. If you can
support Live Migration, you have a reasonable way to update host
software with minimal impact to the VM. Live Update really shines in
scenarios where Live Migration is untenable, since host upgrades
require VM terminations. In the limit, Live Update can have lower
impact on the VM than Live Migration, since there is no state transfer
across hosts. But Live Migration can enable more maintenance scenarios
than Live Update (like HW maintenance, and firmware upgrades). So I
think both are valuable to support.