Re: [PATCH 1/1] liveupdate: luo_file: Add internal APIs for file preservation

From: Pasha Tatashin

Date: Mon Jun 29 2026 - 03:40:55 EST

On 06-26 13:57, Pratyush Yadav wrote:
> Hi Sami,
>
> On Sat, Jun 13 2026, Samiullah Khawaja wrote:
>
> > From: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> >
> > Live update orchestrator file handlers depend on the preservation of
> > other files. To make sure that the dependency is preserved, the file
> > handlers needs to fetch the preservation token of the preserved
> > dependency. Similarly during restore, a file handler wants to fetch the
> > restored file of the dependency.
> >
> > Add APIs that allows fetching token of dependency during preservation,
> > and fetching the restored file dependency during restore.
> >
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> > Signed-off-by: Samiullah Khawaja <skhawaja@xxxxxxxxxx>
>
> We discussed this once already on a call, but I'll write my argument out
> here for everyone else to get a say as well.
>
> While it isn't obvious, this patch implicitly defines a part of the uAPI
> for live update. This patch says to VMMs (or other live update users)
> that "you can restore dependent files in any order". That is, VMMs
> don't have to restore the files in a topological sort order or
> dependencies, they can do so in any order and the kernel will manage the
> dependencies on its own.

Not forcing a dependency order is a deliberate design choice in LUO. It
avoids problems with circular dependencies: A depends on B, B depends
on C, and C depends on A.

To achieve this, LUO provides the .can_finish() callback and performs a
two-phase verification:

1. It iterates through all tracked files and invokes .can_finish().
2. Only if *all* files return success does it proceed to invoke .finish().

If a VMM restores a file (such as guest_memfd) but fails to restore its
dependency (such as the VM FD), or attempts to close the session
prematurely, the .can_finish() check for that file will fail (returning
-EBUSY), and the entire finish sequence will abort. This guarantees
kernel-enforced correctness at the session boundary without forcing
the VMM to execute restores in a strict sequential order. A strict
order would not make sense from the kernel side anyway, because with
circular dependencies no topological sort exists.
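
To make the two-phase idea concrete, here is a rough userspace model.
It is only a sketch, not the LUO code: struct model_file and the
model_*() helpers are hypothetical names invented for illustration, and
the only behavior taken from the description above is "check every
file first, return -EBUSY if a dependency is missing, and run finish
only if all checks pass".

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the real LUO structures. */
struct model_file {
	bool restored;			/* has the VMM restored this file? */
	bool finished;			/* set only by the finish phase */
	struct model_file *dep;		/* file this one depends on, or NULL */
};

/* Models .can_finish(): refuse while the dependency is not restored. */
static int model_can_finish(const struct model_file *f)
{
	if (f->dep && !f->dep->restored)
		return -EBUSY;
	return 0;
}

/* Models the session finish path with its two phases. */
static int model_session_finish(struct model_file **files, size_t n)
{
	size_t i;
	int err;

	/* Phase 1: every file must agree before anything is finished. */
	for (i = 0; i < n; i++) {
		err = model_can_finish(files[i]);
		if (err)
			return err;
	}

	/* Phase 2: reached only if all checks passed. */
	for (i = 0; i < n; i++)
		files[i]->finished = true;

	return 0;
}
```

Note that the check only looks at whether the dependency has been
restored, not at when, so the order of restores does not matter and a
cycle of dependencies is handled the same way as a chain.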

>
> But on the preservation side, VMMs still do need to follow the
> topological order of dependencies. Because if they don't, the
> liveupdate_get_token_outgoing() call will fail and preservation can't
> proceed.

Actually, preservation can also be performed in an order-independent manner.
While a handler can call liveupdate_get_token_outgoing() during .preserve(),
it can also defer this query until the .freeze() callback. Because .freeze()
is invoked after all files in the session have completed their .preserve() phase,
all dependency tokens are guaranteed to be available, completely eliminating any
topological ordering requirements during the initial preservation calls. It is
up to individual file handler implementations to decide whether they wish to
enforce ordering at .preserve() time or defer it to .freeze().
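
A minimal userspace sketch of the deferred lookup follows. Again, this
is a model and not the LUO code: struct model_pfile and the model_*()
helpers are hypothetical, model_get_token() merely stands in for
liveupdate_get_token_outgoing() (whose real signature is not shown
here), and the -ENOENT return for a not-yet-preserved file is an
assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not the real LUO structures. */
struct model_pfile {
	bool preserved;			/* has .preserve() completed? */
	uint64_t token;			/* token assigned at preserve time */
	struct model_pfile *dep;	/* file this one depends on, or NULL */
	uint64_t dep_token;		/* dependency token, filled at freeze */
};

/* Stand-in for the token lookup: fails until the file is preserved. */
static int model_get_token(const struct model_pfile *f, uint64_t *token)
{
	if (!f->preserved)
		return -ENOENT;		/* assumed error code */
	*token = f->token;
	return 0;
}

/* Models .preserve(): records the token, does not touch the dependency. */
static int model_preserve(struct model_pfile *f, uint64_t token)
{
	f->token = token;
	f->preserved = true;
	return 0;
}

/* Models .freeze(): by now every file in the session is preserved. */
static int model_freeze(struct model_pfile *f)
{
	if (!f->dep)
		return 0;
	return model_get_token(f->dep, &f->dep_token);
}
```

With this shape, preserving the dependent file before its dependency
still works, because the lookup happens only in the freeze step. A
handler that prefers to enforce ordering would instead do the lookup
in its preserve step and fail there.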

> In simple words, if file type A depends on file type B, VMMs always need
> to preserve B before A, because A's preservation will try to find B's
> token, and if B is not preserved that will fail. On the _restore_ side
> though, liveupdate_get_file_incoming() implicitly retrieves the file so
> the VMM can restore then in any order.
>
> I don't like this for a couple reasons. First, this makes the API
> asymmetric. If the VMM needs to manage dependency order during
> preservation anyway, why not do it on retrieve as well?
>
> Second, the API is easier to misuse. The VMM can restore A but not B,
> and then close the session. It will go on its merry way never knowing it
> did something wrong. For example, guest_memfd depends on its VM FD. With
> this patch, LUO will allow restoring guest_memfd without restoring the
> VM FD. This makes the guest_memfd practically useless. Yes, it is a bug
> in the VMM anyway, but if guest_memfd restore was denied, then it would
> be easier to catch.
>
> The kernel will keep itself safe in either case, but it will make the
> API harder to misuse. And you can always _relax_ the ordering
> requirement if there is a need in the future, but you can't go the other
> way round.
>
> So that's my question: do we enforce restore ordering? The code change
> should be relatively simple. You just need to fail if the file is not
> already restored in liveupdate_get_file_incoming().
>
> In either case, please at least add a piece in the documentation about
> this ordering. We should not leave it implicit.

As explained above, the .can_finish() callback addresses this problem
and prevents any misuse (such as closing a session with a missing VM FD
dependency).

That said, I agree that the ordering semantics, the deferred
verification model, and the exact roles of .can_finish() and .freeze()
should not remain implicit. It makes sense to add details to the
documentation to clarify this behavior.

Pasha