Re: [PATCH v3 2/2] liveupdate: initialize incoming FLB state before finish

From: Pratyush Yadav

Date: Thu Apr 02 2026 - 09:38:13 EST

On Thu, Mar 26 2026, Leo Timmins wrote:

> luo_flb_file_finish_one() decremented incoming.count before making sure
> that the incoming FLB state had been materialized. If no earlier incoming
> retrieval had populated that state, the first decrement ran from zero and
> skipped the last-user finish path.
>
> Initialize the incoming FLB state before the first decrement so finish
> uses the serialized refcount instead of an uninitialized value.

This commit message makes it very hard to understand what the problem it
fixes. It took me 20 minutes of reading this patch and looking at the
FLB code to figure out what is going on. Here is what I'd suggest:

The state of an incoming FLB object is initialized when it is first
used. The initialization is done via luo_flb_retrieve_one(), which looks
at all the incoming FLBs, matches the FLB to its serialized entry, and
initializes the incoming data and count.

luo_flb_file_finish_one() is called when finish is called for a file
registered with this FLB. If no file handler has used the FLB by this
point, the count stays un-initialized at 0. luo_flb_file_finish_one()
then decrements this un-initialized count, leading to an underflow. This
results in the FLB finish never being called since the count has
underflowed to a very large value.

Fix this by making sure the FLB is retrieved before using its count.

>
> Fixes: cab056f2aae7 ("liveupdate: luo_flb: introduce File-Lifecycle-Bound global state")
> Reviewed-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>
> Signed-off-by: Leo Timmins <leotimmins1974@xxxxxxxxx>
> ---
> kernel/liveupdate/luo_flb.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/liveupdate/luo_flb.c b/kernel/liveupdate/luo_flb.c
> index f52e8114837e..855af655b09b 100644
> --- a/kernel/liveupdate/luo_flb.c
> +++ b/kernel/liveupdate/luo_flb.c
> @@ -192,10 +192,27 @@ static int luo_flb_retrieve_one(struct liveupdate_flb *flb)
> static void luo_flb_file_finish_one(struct liveupdate_flb *flb)
> {
> struct luo_flb_private *private = luo_flb_get_private(flb);
> + bool needs_retrieve = false;
> u64 count;
>
> - scoped_guard(mutex, &private->incoming.lock)
> + scoped_guard(mutex, &private->incoming.lock) {
> + if (!private->incoming.count && !private->incoming.finished)
> + needs_retrieve = true;
> + }
> +
> + if (needs_retrieve) {
> + int err = luo_flb_retrieve_one(flb);
> +
> + if (err) {
> + pr_warn("Failed to retrieve FLB '%s' during finish: %pe\n",
> + flb->compatible, ERR_PTR(err));
> + return;
> + }
> + }
> +
> + scoped_guard(mutex, &private->incoming.lock) {
> count = --private->incoming.count;
> + }

This looks way too overcomplicated. We just need to move the retrieve
call before we check the count.

Pasha, side note: I think FLB suffers from the same problem I fixed for
LUO files with f85b1c6af5bc ("liveupdate: luo_file: remember retrieve()
status"). If retrieve fails, it doesn't remember the error code and will
retry retrieve() on next usage, causing all sorts of problems.

Pasha, another side note: I think incoming.private should be initialized
when the FLB is registered, not when it is first used. This will make
this simpler overall. This would need liveupdate_register_flb() to not
call kzalloc(), but that can be done I think.

Anyway, for this problem, how about the patch below (only
compile-tested) addressing my comments. The diff looks bigger but the
end result is a lot cleaner IMO. In effect it just moves the retrieve
call outside the if (!count). Everything else is just cleaning up the
locking situation.

--- 8< ---