Re: [RFC PATCH 2/2] livepatch: Implement livepatch hybrid mode

From: Petr Mladek
Date: Wed Feb 05 2025 - 11:04:42 EST


On Wed 2025-02-05 10:54:47, Yafang Shao wrote:
> On Tue, Feb 4, 2025 at 9:21 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
> >
> > On Mon 2025-01-27 23:34:50, Yafang Shao wrote:
> > > On Mon, Jan 27, 2025 at 10:31 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
> > > >
> > > > On Mon 2025-01-27 14:35:26, Yafang Shao wrote:
> > > > > The atomic replace livepatch mechanism was introduced to handle scenarios
> > > > > where we want to unload a specific livepatch without unloading others.
> > > > > However, its current implementation has significant shortcomings, making
> > > > > it less than ideal in practice. Below are the key downsides:
> > > >
> > > > [...]
> > > >
> > > > > In the hybrid mode:
> > > > >
> > > > > - Specific livepatches can be marked as "non-replaceable" to ensure they
> > > > > remain active and unaffected during replacements.
> > > > >
> > > > > - Other livepatches can be marked as "replaceable," allowing targeted
> > > > > replacements of only those patches.
> > > > >
> > > > > This selective approach would reduce unnecessary transitions, lower the
> > > > > risk of temporary patch loss, and mitigate performance issues during
> > > > > livepatch replacement.
> > > > >
> > > > > --- a/kernel/livepatch/core.c
> > > > > +++ b/kernel/livepatch/core.c
> > > > > @@ -658,6 +658,8 @@ static int klp_add_nops(struct klp_patch *patch)
> > > > > klp_for_each_object(old_patch, old_obj) {
> > > > > int err;
> > > > >
> > > > > + if (!old_patch->replaceable)
> > > > > + continue;
> > > >
> > > > This is one example where things might get very complicated.
> > >
> > > Why does this example even exist in the first place?
> > > If hybrid mode is enabled, this scenario could have been avoided entirely.
> >
>
> You have many questions, but it seems you haven’t taken the time to read even
> a single line of this patchset. I’ll try to address them to save you some time.

What?

> > How exactly this scenario could be avoided, please?
> >
> > In the real life, livepatches are used to fix many bugs.
> > And every bug is fixed by livepatching several functions.
> >
> > Fix1 livepatches: funcA
> > Fix2 livepatches: funcA, funcB
> > Fix3 livepatches: funcB
> >
> > How would you handle this with the hybrid model?
>
> In your scenario, each Fix will replace the applied livepatches, so
> they must be set to replaceable.
> To clarify the hybrid model, I'll illustrate it with the following Fixes:
>
> Fix1 livepatches: funcA
> Fix2 livepatches: funcC
> Fix3 livepatches: funcA, funcB

How does look the livepatched FuncA here?
Does it contain changes only for Fix3?
Or is it cummulative livepatches_funA includes both Fix1 + Fix3?

> Fix4 livepatches: funcD
> Fix5 livepatches: funcB
>
> Each FixN represents a single /sys/kernel/livepatch/FixN.
> By default, all Fixes are set to non-replaceable.
>
> Step-by-step process:
> 1. Loading Fix1
> Loaded Fixes: Fix1
> 2. Loading Fix2
> Loaded Fixes: Fix1 Fix2
> 3. Loading Fix3
> Before loading, set Fix1 to replaceable and replace it with Fix3,
> which is a cumulative patch of Fix1 and Fix3.

Who will modify the "replaceable" flag of "Fix1"? Userspace or kernel?

How the livepatch modules would get installed/removed to/from the
filesystem?

We (SUSE) distribute the livepatches using RPM packages. Every new
version of the livepatch module is packaged in a RPM package with
the same name and higher version. A post install script loads
the module into the kernel and removes disabled livepatch modules.

The package update causes that the new version of the livepatch module
is enabled and the old version is removed. And also the old version
of the module and is removed from the filesystem together with the old
version of the RPM package.

This matches the behavior of the atomic replace. There is always only
one version of the livepatch RPM package installed and only one
livepatch module loaded/enabled. And when it works correcly then
the version of the installed package matches the version of the loaded
livepatch module.

This might be hard to achieve with the hybrid model. Every livepatch
module would need to be packaged in a separate (different name)
RPM package. And some userspace service would need to clean up both
kernel modules and livepatch RPM packages (remove the unused ones).

This might add a lot of extra complexity.

> Loaded Fixes: Fix2 Fix3
> 4. Loading Fix4
> Loaded Fixes: Fix2 Fix3 Fix4
> 5. Loading Fix5
> Similar to Step 3, set Fix3 to replaceable and replace it with Fix5.
> Loaded Fixes: Fix2 Fix4 Fix5

Let's imagine another situation:

Fix1 livepatches: funcA, funcB
Fix2 livepatches: funcB, funcC
Fix3 livepatches: funcA, funcC

Variant A:

1. Loading Fix1
Loaded Fixes: Fix1
In use:: funcA_fix1, funcB_fix1

2. Loading Fix2
Loaded Fixes: Fix1 Fix2
In use: funcA_fix1, funcB_fix2, funcC_fix2

3. Loading Fix3
Loaded Fixes: Fix2 Fix3
In use: funcA_fix3, funcB_fix2, funcC_fix3

This will be correct only when:

+ funcA_fix3 contains both changes from Fix1 and Fix3
+ funcC_fix3 contains both changes from Fix2 and Fix3


Variant B:

1. Loading Fix1
Loaded Fixes: Fix1
In use:: funcA_fix1, funcB_fix1

2. Loading Fix2 (fails from some reason or is skipped)
Loaded Fixes: Fix1
In use:: funcA_fix1, funcB_fix1

3. Loading Fix3
Loaded Fixes: Fix1 Fix2
In use: funcA_fix3, funcB_fix1, funcC_fix3

This will be correct only when:

+ funcA_fix3 contains both changes from Fix1 and Fix3
and stays funcB_fix1 is compatible with funcA_fix3
+ funcC_fix3 contains changes only from Fix3,
it must be compatible with the original funcB because

I want to say that this would work only when all livepatches
are loaded in the right order. Otherwise, it might break
the system.

How do you want to ensure this?

Is it really that easy?

> 3. Loading Fix3
> Before loading, set Fix1 to replaceable and replace it with Fix3,

> This hybrid model ensures that irrelevant Fixes remain unaffected
> during replacements.

It makes some some now. But IMHO, it is not as easy as it looks.
The complexity is in details.

> >
> > Which fix will be handled by livepatches installed in parallel?
> > Which fix will be handled by atomic replace?
> > How would you keep it consistent?
> >
> > How would it work when there are hundreds of fixes and thousands
> > of livepatched functions?
>
> The key point is that if a new Fix modifies a function already changed
> by previous Fixes, all the affected old Fixes should be set to
> replaceable, merged into the new Fix, and then the old Fixes should be
> replaced with the new one.

As I tried to explain above. This might be hard to use in practice.

We would either need to enforce loading all livepatches in the right
order. It might be hard to make it user friendly.

Or it would need a lot of extra code which would ensure that only
compatible livepatches can be loaded.

Best Regards,
Petr