Re: [PATCH v6 03/12] PCI: liveupdate: Track incoming preserved PCI devices

From: David Matlack

Date: Mon Jun 15 2026 - 16:29:19 EST


On 2026-06-14 01:38 PM, Pasha Tatashin wrote:
> On Fri, 22 May 2026 20:24:01 +0000, David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> > index 10c9b65aa242..e68ae5c172d4 100644
> > --- a/drivers/pci/Kconfig
> > +++ b/drivers/pci/Kconfig
> > @@ -330,7 +330,7 @@ config VGA_ARB_MAX_GPUS
> >
> > config PCI_LIVEUPDATE
> > bool "PCI Live Update Support"
> > - depends on PCI && LIVEUPDATE
> > + depends on PCI && LIVEUPDATE && 64BIT
>
> Please move this to the first patch, fewer changes between patches, and
> also KHO does not support anything but 64-bit mode.

I think it is preferred to introduce changes when they are needed. This
is the exact patch that requires 64BIT so it makes sense to add the
dependency within this patch no?

>
> >
> > diff --git a/drivers/pci/liveupdate.c b/drivers/pci/liveupdate.c
> > index 065d5af822f7..96c43b84532c 100644
> > --- a/drivers/pci/liveupdate.c
> > +++ b/drivers/pci/liveupdate.c
> > @@ -128,13 +157,49 @@ static void pci_flb_unpreserve(struct liveupdate_flb_op_args *args)
> > [ ... skip 31 lines ... ]
> > +
> > +err_xa_destroy:
> > + xa_destroy(&incoming->xa);
> > + kfree(incoming);
> > +err_restore_free:
> > + kho_restore_free(ser);
>
> This is the pattern we have been enforcing in other places in LUO. If
> the first retrieval fails, return the same error thereafter.
>
> > @@ -270,6 +335,91 @@ void pci_liveupdate_unpreserve(struct pci_dev *dev)
> > }
> > EXPORT_SYMBOL_GPL(pci_liveupdate_unpreserve);
> >
> > +static struct pci_flb_incoming *pci_liveupdate_flb_get_incoming(void)
> > +{
> > + struct pci_flb_incoming *incoming = NULL;
> > + int ret;
>
> Maybe make the error return static, and avoid another search through compatible
> FLBs if it failed before?

Good idea, will do.

>
> 1. Add "saved_err;"; if it is set, return it right away.
> 2. Change all errors to use goto save_err;, and at the end of the
> function, assign ret to saved_err;
>
> > [ ... skip 15 lines ... ]
> > + * This could mean that no PCI FLB data was passed by the previous
> > + * kernel, but it could also mean the previous kernel used a different
> > + * compatibility string (i.e. a different ABI).
> > + */
> > + if (ret == -ENOENT) {
> > + pr_info_once("No incoming FLB matched %s\n", pci_liveupdate_flb.compatible);
>
> I would assume this is very normal, e.g., no devices were preserved but
> memfd+hugetlb was preserved. Maybe use pr_debug_once().

The problem is described in the comment above. The PCI core cannot
distinguish between:

1. There was not PCI FLB preserved. and
2. There was PCI FLB preserved, but it was not compatible.

I agree we should not log anything for (1) but (2) definitely requires
at least some kind of logging so that's why I went with info-level here
and not debug.

>
> > + return NULL;
> > + }
> > +
> > + /*
> > + * There is incoming FLB data that matches pci_liveupdate_flb.compatible
> > + * but it cannot be retrieved.
> > + */
> > + if (ret) {
> > + WARN_ONCE(ret, "Failed to retrieve incoming FLB data\n");
>
> No need to print backtrace, please just print a warning:
> pr_warn_once("Failed to retrieve incoming FLB data: %pe\n", ERR_PTR(ret));

SGTM

>
> > [ ... skip 34 lines ... ]
> > + * through pci_liveupdate_finish(). This can happen if PCI core probes
> > + * the same device multiple times, e.g. due to hotplug.
> > + */
> > + if (!dev_ser->refcount) {
> > + pci_liveupdate_flb_put_incoming();
> > + return;
>
> Pleaes use 'goto put_incoming'

SGTM.

>
> > + }
> > +
> > + pci_info(dev, "Device was preserved by previous kernel across Live Update\n");
> > + dev->liveupdate.incoming = dev_ser;
> > +
> > + /*
> > + * Hold the ref on the incoming FLB until pci_liveupdate_finish() so
> > + * that dev->liveupdate.incoming does not get freed while it is in use.
> > + */
>
> How would that work? If finish is not called FLB stays around until the
> next reboot.

True... I think if the PCI core trusts drivers to call
pci_liveupdate_finish() then we don't need to hold onto the incoming
reference here.

>
> --
> Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>