Re: [PATCH v6 02/20] liveupdate: luo_core: integrate with KHO

From: Pasha Tatashin
Date: Mon Nov 17 2025 - 13:30:47 EST


On Sun, Nov 16, 2025 at 2:16 PM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
> On Sun, Nov 16, 2025 at 09:55:30AM -0500, Pasha Tatashin wrote:
> > On Sun, Nov 16, 2025 at 7:43 AM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
> > >
> > > > +static int __init liveupdate_early_init(void)
> > > > +{
> > > > +	int err;
> > > > +
> > > > +	err = luo_early_startup();
> > > > +	if (err) {
> > > > +		pr_err("The incoming tree failed to initialize properly [%pe], disabling live update\n",
> > > > +		       ERR_PTR(err));
> > >
> > > How do we report this to the userspace?
> > > I think the decision what to do in this case belongs there. Even if it's
> > > down to choosing between plain kexec and full reboot, it's still a policy
> > > that should be implemented in userspace.
> >
> > I agree that policy belongs in userspace, and that is how we designed
> > it. In this specific failure case (ABI mismatch or corrupt FDT), the
> > preserved state is unrecoverable by the kernel. We cannot parse the
> > incoming data, so we cannot offer it to userspace.
> >
> > We report this state by not registering the /dev/liveupdate device.
> > When the userspace agent attempts to initialize, it receives ENOENT.
> > At that point, the agent exercises its policy:
> >
> > - Check dmesg for the specific error and report the failure to the
> > fleet control plane.
>
> Hmm, this is not nice. I think we still should register /dev/liveupdate and
> let userspace discover this error via /dev/liveupdate ABIs.

Not registering the device is the correct approach here for two reasons:

1. This follows the standard Linux driver pattern. If a driver fails
to initialize its underlying resources (hardware, firmware, or in this
case, the incoming FDT), it does not register a character device.
2. Registering a "zombie" device that exists solely to return errors
adds significant complexity. We would need to introduce a specific
"broken" state to the state machine and add checks to IOCTLs to reject
commands with a specific error code.
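
To make the userspace side concrete, here is a rough sketch of how an
agent could tell "device absent" apart from other failures when it
opens the node. This is only an illustration: lu_agent_probe() and
enum lu_agent_action are hypothetical names, not part of this series,
and the path is passed in so the logic can be exercised without a
live-update kernel.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical agent-side outcomes; not an ABI defined by the patches. */
enum lu_agent_action {
	LU_AGENT_PROCEED,	/* node exists: continue with live update */
	LU_AGENT_FALLBACK,	/* ENOENT: check dmesg, report, apply fallback policy */
	LU_AGENT_ERROR,		/* any other open() failure (EACCES, EBUSY, ...) */
};

/*
 * Try to open the live update node (normally "/dev/liveupdate").
 * On success *fd_out holds the open descriptor, otherwise -1.
 */
static enum lu_agent_action lu_agent_probe(const char *path, int *fd_out)
{
	int fd;

	assert(path != NULL && fd_out != NULL);

	fd = open(path, O_RDWR);
	if (fd >= 0) {
		*fd_out = fd;
		return LU_AGENT_PROCEED;
	}

	*fd_out = -1;
	/* A missing node is how the kernel signals that the incoming state was unusable. */
	return errno == ENOENT ? LU_AGENT_FALLBACK : LU_AGENT_ERROR;
}
```

With something like this, the agent calls
lu_agent_probe("/dev/liveupdate", &fd) at startup and only needs one
extra branch for the fallback case; no additional kernel state is
required to express the failure.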

Pasha