Re: [PATCH v2 0/5] liveupdate: validate restored LUO metadata

From: Pasha Tatashin

Date: Wed May 06 2026 - 12:37:40 EST

On 05-06 18:15, Pratyush Yadav wrote:
> On Wed, May 06 2026, Pasha Tatashin wrote:
>
> > On 05-06 11:02, Pratyush Yadav wrote:
> >> Hi Pasha,
> >>
> >> On Fri, May 01 2026, Pasha Tatashin wrote:
> >>
> >> > On 05-02 01:30, Cris Jacob Maamor wrote:
> >> >> LUO restores metadata from KHO/FDT during liveupdate. The restored
> >> >> metadata contains physical addresses and count fields used to access and
> >> >> walk preserved session, file set, and FLB arrays.
> >> >>
> >> >> This series adds a non-consuming KHO preserved-range check and uses it
> >> >> before phys_to_virt() on restored metadata addresses. It also rejects
> >> >> restored counts above LUO_SESSION_MAX, LUO_FILE_MAX, and LUO_FLB_MAX
> >> >> before traversal.
> >> >>
> >> >> As far as I can tell, this is root/admin-only; I do not have evidence
> >> >> that a normal unprivileged user can trigger it directly.
> >> >>
> >> >> Changes since v1:
> >> >> - Dropped RFC marking.
> >> >> - Added changelog text to each patch.
> >> >> - No code changes.
> >> >>
> >> >> Cris Jacob Maamor (5):
> >> >> kexec: handover: add helper to check preserved page ranges
> >> >> liveupdate: validate LUO FDT physical address before mapping
> >> >> liveupdate: validate restored LUO session metadata
> >> >> liveupdate: validate restored LUO file set metadata
> >> >> liveupdate: validate restored LUO FLB metadata
> >> >
> >> > I have replied separately in the security report to clarify that this is
> >> > not a bug. The behavior follows the ABI specification exactly: we use
> >> > the PA addresses and ranges provided by the KHO FDT tree.
> >> >
> >> > NAK
> >>
> >> I really do think we should do a restore-only variant for the
> >> kho_alloc_preserve() family of allocators and use it everywhere. It
> >
> > That is unrelated to the provided patch series. The author of this
> > series reported this as a security issue to the Linux security ML, and
> > submitted this series at their request.
>
> Oh yes, sure. I am not arguing for taking this series. I just figured
> this would be a good point to have this discussion.
>
> >
> > This is not a security issue, and in fact, it is not an issue at all. A
> > restore-only variant can be added, but I do not see a reason for LUO to
> > use it.
> >
> >> would prevent problems in the future. Not because the previous kernel is
> >> malicious, but because we might have bugs and the KHO page magic sanity
> >> check acts as a defense in depth.
> >>
> >> For example, I am currently looking at a LUO bug where LUO does not
> >> track if a session is outgoing or incoming. So you can do a retrieve()
> >> or finish() on an outgoing session. A lot of nastiness is saved because
> >> of the page magic check. Things like kho_restore_vmalloc() or
> >> kho_restore_folio() fail early and loudly.
> >
> > I am not sure what bug you are looking at (please share the details!),
>
> I was looking at LUO code and realized that we do not separate outgoing
> and incoming sessions when dealing with preserve/retrieve/finish ioctls.
> So you can create a session, preserve a FD, and then immediately call
> finish or retrieve without doing a kexec. Of course, LUO file handlers
> aren't able to cope with it.

Oh, this makes sense, please add a self-test for that as well :-)
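
For illustration only, here is a rough userspace model of what such a
self-test might assert. Everything in it is hypothetical: the model_*
names, the struct layout, and the -EINVAL return value are assumptions
made for this sketch and are not the real LUO uAPI or the planned fix.
It only encodes the expectation discussed above, that retrieve() and
finish() are refused on a session that has not gone through a kexec.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model, not the real LUO structures or ioctls. */
enum model_dir {
	MODEL_OUTGOING,	/* created in the current kernel, before kexec */
	MODEL_INCOMING,	/* restored from the previous kernel */
};

struct model_session {
	enum model_dir dir;
	int preserved;	/* number of FDs currently preserved */
};

/* preserve() is only meaningful on an outgoing session. */
static int model_preserve(struct model_session *s)
{
	if (s->dir != MODEL_OUTGOING)
		return -EINVAL;	/* assumed error code */
	s->preserved++;
	return 0;
}

/* retrieve() is only meaningful on an incoming session. */
static int model_retrieve(struct model_session *s)
{
	if (s->dir != MODEL_INCOMING)
		return -EINVAL;	/* assumed error code */
	if (s->preserved == 0)
		return -ENOENT;
	s->preserved--;
	return 0;
}

/* finish() is only meaningful on an incoming session. */
static int model_finish(struct model_session *s)
{
	if (s->dir != MODEL_INCOMING)
		return -EINVAL;	/* assumed error code */
	s->preserved = 0;
	return 0;
}

/*
 * Preserve on a fresh outgoing session, then check that retrieve and
 * finish are both rejected and leave the session state untouched.
 * Returns 0 on success, or the number of the failing step.
 */
static int model_selftest(void)
{
	struct model_session out = { .dir = MODEL_OUTGOING, .preserved = 0 };

	if (model_preserve(&out) != 0)
		return 1;
	if (model_retrieve(&out) != -EINVAL)
		return 2;
	if (model_finish(&out) != -EINVAL)
		return 3;
	if (out.preserved != 1)
		return 4;
	return 0;
}
```

A real kselftest would issue the actual ioctls on a session FD instead
of calling model functions; the sequence of steps is the part that
carries over.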

>
> So for example, you can preserve a memfd and then immediately call
> finish. This will call memfd_luo_finish(), where it will try to
> kho_restore_vmalloc(). That fails with a big WARN splat. And then later
> it calls kho_restore_free() which also fails in a similar fashion.
>
> You can do the same thing with retrieve(), but that also fails early and
> loudly and does not cause any problems.
>
> I am working on a fix for it. Should have something out shortly.
>
> > but the fix absolutely should be to use outgoing/incoming sessions
> > properly, and if we mixed them up somewhere, THAT should be fixed. Using
> > KHO restore is not going to help much; however, I agree it can add
> > some extra scrutiny (i.e., similar to an ASSERT), but it is not really
> > something that would help improve correctness in any meaningful way. The
> > correctness should lie in the LUO logic using incoming as incoming, and
> > outgoing as outgoing.
>
> I am not arguing that we shouldn't fix the logic bugs. Of course we
> should.
>
> My point is that this sanity check acts as another layer of defence.
> Bugs happen, but the earlier we catch them the better and this sanity
> check helps us do exactly that.
>
> For example, if we did not have these sanity checks, the loud errors I
> described above would be replaced by silent use-after-free, double-free,
> struct page corruption, or other problems.
>
> So I would like to understand why you _don't_ want to have this line of
> defence. What's the problem? If you are worried about performance, we
> can go and measure it. If the overhead is too high this can be behind a
> debug config.

Most likely, there is no performance cost, because when we free
preserved memory, we still need to do a KHO restore. The only difference
is that it may occur after the blackout rather than during it. Anyway, if
you would like to add this sanity check, please send it out, and we can
review and discuss how it looks.
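
To make the "restore before use" idea from the discussion above
concrete, here is a toy userspace model of a magic-based sanity check.
It is a sketch under stated assumptions, not how KHO implements it: the
model_* names, the magic value, and the struct are all invented for
this example. The point it shows is that a restore which validates and
consumes a marker turns a second restore, or a restore of something
never preserved, into an immediate failure rather than silent
corruption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary marker value chosen for this sketch only. */
#define MODEL_MAGIC 0x4b484f50u

/* Hypothetical stand-in for per-page preservation metadata. */
struct model_page {
	uint32_t magic;	/* MODEL_MAGIC while preserved and not yet restored */
	void *data;	/* payload handed back on a successful restore */
};

/* Mark a page as preserved so that exactly one restore can succeed. */
static void model_mark_preserved(struct model_page *p, void *data)
{
	p->magic = MODEL_MAGIC;
	p->data = data;
}

/*
 * Restore before use: hand out the payload only if the marker is
 * intact, and clear the marker so a repeated restore is rejected.
 */
static void *model_restore(struct model_page *p)
{
	if (p->magic != MODEL_MAGIC)
		return NULL;	/* never preserved, or already restored */
	p->magic = 0;
	return p->data;
}
```

In this model, a caller that skips model_restore() and dereferences
p->data directly gets no such protection, which is the usage pattern
the restore-only proposal is meant to discourage.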

>
> >
> >>
> >> If we want to squeeze out more performance later down the line we can
> >> move it behind a debug config, but having this usage pattern of always
> >> restoring before using is going to be a lot more sane than just using
> >> physical addresses willy nilly.
> >>
> >> The approach this series takes with kho_is_preserved() is the wrong
> >> design. But a kho_restore() or something similar (maybe we can find a
> >> better name?) is really where we should be going.
> >>
> >> --
> >> Regards,
> >> Pratyush Yadav
>
> --
> Regards,
> Pratyush Yadav