Re: [RFC PATCH 3/4] livepatch: Add "replaceable" attribute to klp_patch
From: Petr Mladek
Date: Tue Apr 07 2026 - 09:53:55 EST
On Mon 2026-04-06 19:08:05, Yafang Shao wrote:
> On Sat, Apr 4, 2026 at 5:36 AM Song Liu <song@xxxxxxxxxx> wrote:
> >
> > On Fri, Apr 3, 2026 at 1:55 PM Dylan Hatch <dylanbhatch@xxxxxxxxxx> wrote:
> > [...]
> > > > IIRC, the use case for this change is when multiple users load various
> > > > livepatch modules on the same system. I still don't believe this is the
> > > > right way to manage livepatches. That said, I won't really NACK this
> > > > if other folks think this is a useful option.
> > >
> > > In our production fleet, we apply exactly one cumulative livepatch
> > > module, and we use per-kernel build "livepatch release" branches to
> > > track the contents of these cumulative livepatches. This model has
> > > worked relatively well for us, but there are some painpoints.
> > >
> > > We are often under pressure to selectively deploy a livepatch fix to
> > > certain subpopulations of production. If the subpopulation is running
> > > the same build of everything else, this would require us to introduce
> > > another branching factor to the "livepatch release" branches --
> > > something we do not support due to the added toil and complexity.
> > >
> > > However, if we had the ability to build "off-band" livepatch modules
> > > that were marked as non-replaceable, we could support these selective
> > > patches without the additional branching factor. I will have to
> > > circulate the idea internally, but to me this seems like a very useful
> > > option to have in certain cases.
> >
> > IIUC, the plan is:
> >
> > - The regular livepatches are cumulative, have the replace flag; and
> > are replaceable.
> > - The occasional "off-band" livepatches do not have the replace flag,
> > and are not replaceable.
> >
> > With this setup, for systems with off-band livepatches loaded, we can
> > still release a cumulative livepatch to replace the previous cumulative
> > livepatch. Is this the expected use case?
>
> That matches our expected use case.
>
> >
> > I think there are a few issues with this:
> > 1. The "off-band" livepatches cannot be replaced atomically. To upgrade
> > "off-band" livepatches, we will have to unload the old version and load
> > the new version later.
>
> Right. That is how the non-atomic-replace patch works.
>
> > 2. Any conflict with the off-band livepatches and regular livepatches will
> > be difficult to manage.
>
> We need to manage this conflict with a complex user script. That said,
> everything can be controlled from userspace.
>
> > IOW, we kind of removed the benefit of cumulative
> > livepatches. For example, what shall we do if we really need two fixes
> > to the same kernel functions: one from the original branch, the other
> > from the off-band branch?
>
> We run tens of livepatches on our production servers and have never
> run into this issue. It's an extremely rare case — and if it does
> happen, a user script should be able to handle it just fine.

Could you please share the script? Or at least summarize the situations
in which the script detects a conflict and refuses to load a livepatch?

I believe that most, if not all, of these checks can be implemented in
the kernel. And if we agree to add a hybrid mode, then it should be added
together with the checks.
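
For illustration only, here is a minimal sketch of the kind of check such
a userspace script might perform. It is not the script discussed above,
and it is not a proposal for the kernel-side implementation. It assumes
the sysfs layout /sys/kernel/livepatch/<patch>/<object>/<function,sympos>,
and it assumes that "conflict" means the same function being patched by
more than one loaded livepatch; the real script may check something else.
The helper names patched_functions() and find_conflicts() are hypothetical.

```python
import os
from collections import defaultdict


def patched_functions(root):
    """Map (object, "function,sympos") to the livepatches that patch it.

    Assumes each patch directory under root contains one directory per
    patched object, which in turn contains one directory per function.
    """
    owners = defaultdict(list)
    for patch in sorted(os.listdir(root)):
        patch_dir = os.path.join(root, patch)
        if not os.path.isdir(patch_dir):
            continue
        for obj in sorted(os.listdir(patch_dir)):
            obj_dir = os.path.join(patch_dir, obj)
            if not os.path.isdir(obj_dir):
                continue  # skip attribute files such as "enabled"
            for func in sorted(os.listdir(obj_dir)):
                if os.path.isdir(os.path.join(obj_dir, func)):
                    owners[(obj, func)].append(patch)
    return owners


def find_conflicts(root="/sys/kernel/livepatch"):
    """Return only the functions patched by more than one livepatch."""
    return {
        key: patches
        for key, patches in patched_functions(root).items()
        if len(patches) > 1
    }
```

A script could call find_conflicts() before loading another module and
refuse to continue if the result is not empty. The sketch only sees
livepatches that are already loaded, so a real pre-load check would also
need the function list of the module that is about to be loaded.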

We have already invested a lot of effort into making kernel
livepatching as safe as possible. From my POV, the most important
parts are:

  + consistency model: Each task is transitioned separately, once it
    does not use any livepatched function.

  + atomic replace: Transition all livepatched functions at once.

If we agree to add the hybrid model, then we should add it with
some safety belts as well. And it would be nice to take inspiration
for the safety checks from your script.

Best Regards,
Petr