Re: [PATCH v10 05/10] livepatch: Support separate list for replaced patches.

From: Josh Poimboeuf
Date: Mon Apr 16 2018 - 15:04:32 EST


On Mon, Apr 16, 2018 at 04:58:11PM +0200, Petr Mladek wrote:
> On Wed 2018-04-11 10:48:52, Josh Poimboeuf wrote:
> > On Wed, Apr 11, 2018 at 04:17:11PM +0200, Petr Mladek wrote:
> > > > I still agree with my original conclusion that enforcing stack order no
> > > > longer makes sense though.
> > >
> > > The question is what we will get if we remove the stack. Will it
> > > really make the code easier and livepatching more safe?
> > >
> > > First, note that stack is the main trick used by the ftrace handler.
> > > It gets the patch from top of the stack and use the new_addr from
> > > that patch.
> > >
> > > If we remove the stack, we still need to handle 3 possibilities.
> > > The handler will need to continue either with original_code,
> > > old_patch or new_patch.
> > >
> > > Now, just imagine the code. We will need variables orig_addr,
> > > old_addr, new_addr or so which might be confusing. It will be
> > > even more confusion if we do revert/disable. Also new_addr will
> > > become old_addr if we install yet another patch.
> > >
> > > We had exactly this in kGraft and it was a mess. I said "wow,
> > > that is genius" when I saw the stack approach in the upstream
> > > code proposal.
> >
> > You're confusing the func stack with the patch stack.
> >
> > My proposal is to get rid of the patch stack.
>
> The are related from my POV. The patches have the same ordering
> in both patch and function stacks. The patch ordering is checked
> when the patches are enabled and disabled.

True.

> > We can keep the func stack. It will be needed anyway for the 'replace'
> > case, where the stack may be of size 2.
>
> OK, you want to keep the func stack because it is useful from the
> implementation point of view.
>
>
> > > Second, unrelated patches must never patch the same functions.
> > > Otherwise we would not be able to define which implementation
> > > should be used. This is especially important when a patch is
> > > removed and we need to fallback either to another patch or
> > > original code. Yes, it makes perfect sense. But it needs code
> > > that will check it, refuse loading the patch, ... It is not
> > > complicated. But it is rather additional code than
> > > simplification. I might make livepatching more safe
> > > but probably not simplify the code.
> >
> > We don't need to enforce that. The func stack can stay. If somebody
> > wants to patch the same function multiple times (without using
> > 'replace'), that's inadvisable, but it's also their business. They're
> > responsible for the tooling to ensure the patch stack order is sane.
>
>
> While it might make sense to ignore the patch stack (ordering) for
> the enable operation. Do we really want to ignore it when disabling
> a patch.
>
> By other words, do we want to allow disabling a patch that is in
> the middle of the stack, only partly in use? Does not this allow
> some other crazy scenarios? Is it really the user business?
> Will it make our life easier?

If there's no longer a patch stack, then there's no concept of a middle.
We would expect the patches to be independent of one another, and so
disabling any of them independently would be harmless.

If any of the patches share a func, and the user disables one in the
"middle", it's not our job to support that. The vendor / patch author
should prevent such cases from occurring with tooling, packaging,
documentation, etc. Or they can just use 'replace'.

We can already have similar unexpected situations today. For example,
what if patch B is a cumulative superset of patch A, but the user
mistakenly loads patch A (without replace) *after* loading patch B?
Then some unforeseen craziness could ensue.

We can't control all such scenarios (and that's ok), but we shouldn't
pretend that we support them.

> This will not happen if we refuse to load non-replace patches
> that touch an already patches fucntion. Then the patch stack
> might become only implementation detail. It will not define
> the ordering any longer.

I think this would only be a partial solution. Patches can have
implicit interdependencies, even if they don't patch the same function.
Also it doesn't solve the problem when patches are loaded in the wrong
order. We have to trust vendors and admins to do the right thing.

> > > > > > > Another possibility would be to get rid of the enable/disable states.
> > > > > > > I mean that the patch will be automatically enabled during
> > > > > > > registration and removed during unregistration.
> > > > > >
> > > > > > I don't see how disabling during unregistration would be possible, since
> > > > > > the unregister is called from the patch module exit function, which
> > > > > > can only be called *after* the patch is disabled.
> > > > > >
> > > > > > However, we could unregister immediately after disabling (i.e., in
> > > > > > enabled_store context).
> > > > >
> > > > > I think this is what Petr meant. So there would be nothing in the patch
> > > > > module exit function. Well, not exactly. We'd need to remove sysfs dir and
> > > > > maybe something more.
> > > >
> > > > Sounds good to me, though aren't the livepatch sysfs entries removed by
> > > > klp during unregister?
> > >
> > > This is why I asked in my earlier mail if we need to keep sysfs
> > > entries for unused patches.
> >
> > If they are permanently disabled then I think the sysfs entries should
> > be removed.
> >
> > > We could remove them when the patch gets disabled (transition
> > > finishes). Then we do not need to do anything in module_exit().
> >
> > Agreed, though what happens if the transition finishes while still
> > running in the context of the write to the sysfs entry? Can we remove a
> > sysfs entry while executing code associated with it?
>
> I would need to test it to be sure. But I believe that it will be
> possible to remove sysfs entry from its own callback. All this code
> heavily uses reference counters and callbacks called when the count
> reaches zero.
>
> OK. What about the following solution?
>
> + Enable patch directly from klp_register_patch()
> + Unregister the patch directly from klp_complete_transition()
> when the patch is disabled.
> + Put the module later when all sysfs entries are removed
> in klp_unregister_patch().
>
> As a result:
>
> + module_init() will need to call only klp_register_patch()
> + module_exit() will do nothing
> + the module can be removed only when it is not longer needed

Sounds good to me.

> Some other ideas:
>
> + rename /sys/kernel/livepatch/<patch>/enable -> unregister
> allow to write into /sys/kernel/livepatch/<patch>/transition
>
> + echo 1 >unregistrer to disable&unregister the patch
> + echo 0 >transition to revert the running transition

Why not keep the existing sysfs interfaces? So

echo 0 > enable

would disable (and eventually unregister) the patch.

> The question is what to do with the stack of patches. It will have
> no meaning for the enable operation because it will be done
> automatically. But what about the disable/unregistrer operation?

Assuming we got rid of the patch stack, would we even need to keep a
global list of patches anymore?

> Anyway, this looks like a revolution. Do we want to do all these
> changes, including atomic replace, in a single patchset?

At least for the bits which affect external tooling interfaces, it would
be nice to group them all together.

--
Josh