Re: [PATCH v2] liveupdate: luo_file: remember retrieve() status
From: Andrew Morton
Date: Mon Feb 16 2026 - 16:44:17 EST
On Mon, 16 Feb 2026 14:22:19 +0100 Pratyush Yadav <pratyush@xxxxxxxxxx> wrote:
> From: "Pratyush Yadav (Google)" <pratyush@xxxxxxxxxx>
>
> LUO keeps track of successful retrieve attempts on a LUO file. It does
> so to avoid multiple retrievals of the same file. Multiple retrievals
> cause problems because once the file is retrieved, the serialized data
> structures are likely freed and the file is likely in a very different
> state from what the code expects.
>
> The retrieved boolean in struct luo_file keeps track of this, and is
> passed to the finish callback so it knows what work was already done and
> what it has left to do.
>
> All this works well when retrieve succeeds. When it fails,
> luo_retrieve_file() returns the error immediately, without ever storing
> anywhere that a retrieve was attempted or what its error code was. This
> results in an errored LIVEUPDATE_SESSION_RETRIEVE_FD ioctl to userspace,
> but nothing prevents it from trying this again.
>
> The retry is problematic for much the same reasons listed above. The
> file is likely in a very different state than what the retrieve logic
> normally expects, and it might even have freed some serialization data
> structures. Attempting to access them or free them again is going to
> break things.
>
> For example, if memfd manages to restore 8 of its 10 folios, but fails
> on the 9th, a subsequent retrieve attempt will try to call
> kho_restore_folio() on the first folio again, and that will fail with a
> warning since it is an invalid operation.
>
> Apart from the retry, finish() also breaks. Since on failure the
> retrieved bool in luo_file is never touched, the finish() call on
> session close will tell the file handler that retrieve was never
> attempted, and it will try to access or free the data structures that
> might not exist, much in the same way as the retry attempt.
>
> There is no sane way of attempting the retrieve again. Remember the
> error that retrieve returned and return it directly on a retry. Also pass
> this status code to finish() so it can make the right decision on the
> work it needs to do.
>
> This is done by changing the bool to an integer. A value of 0 means
> retrieve was never attempted, a positive value means it succeeded, and a
> negative value means it failed and the error code is the value.
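To make the encoding concrete, here is a minimal userspace sketch of the
tri-state status described above. It is not the patch itself: struct
sketch_file, sketch_retrieve(), sketch_classify() and the two handlers are
hypothetical names invented for illustration, and the sketch assumes a
handler returns 0 on success or a negative errno on failure. Returning 0
without calling the handler again after an earlier success is also an
assumption made here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct luo_file; only the status matters. */
struct sketch_file {
	int retrieve_status;	/* 0: never attempted, >0: succeeded, <0: -errno */
	int (*do_retrieve)(struct sketch_file *f);	/* hypothetical handler */
	int calls;		/* how many times the handler actually ran */
};

/* What a finish()-like callback could conclude from the status. */
enum sketch_state {
	SKETCH_NOT_ATTEMPTED,
	SKETCH_RETRIEVED,
	SKETCH_FAILED,
};

static int sketch_retrieve(struct sketch_file *f)
{
	int err;

	/* An earlier attempt failed: replay its error, never run it again. */
	if (f->retrieve_status < 0)
		return f->retrieve_status;

	/* An earlier attempt succeeded: do not retrieve a second time. */
	if (f->retrieve_status > 0)
		return 0;

	f->calls++;
	err = f->do_retrieve(f);
	assert(err <= 0);	/* sketch assumption: 0 or negative errno */

	/* Record the outcome whether the attempt succeeded or failed. */
	f->retrieve_status = err ? err : 1;
	return err;
}

static enum sketch_state sketch_classify(const struct sketch_file *f)
{
	if (f->retrieve_status == 0)
		return SKETCH_NOT_ATTEMPTED;
	return f->retrieve_status > 0 ? SKETCH_RETRIEVED : SKETCH_FAILED;
}

/* Hypothetical handlers used to exercise both paths. */
static int sketch_fail_handler(struct sketch_file *f)
{
	(void)f;
	return -EIO;
}

static int sketch_ok_handler(struct sketch_file *f)
{
	(void)f;
	return 0;
}
```

With this shape, a second sketch_retrieve() call on a file whose handler
failed returns the stored error without running the handler again, and a
finish-style callback can tell "never attempted" apart from "attempted and
failed".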
>
> Fixes: 7c722a7f44e0 ("liveupdate: luo_file: implement file systems callbacks")
Should we backport this into 6.19.1?