Re: [PATCH v12 06/12] livepatch: Simplify API by removing registration step
From: Josh Poimboeuf
Date: Tue Oct 23 2018 - 12:39:52 EST
On Mon, Oct 22, 2018 at 03:25:10PM +0200, Petr Mladek wrote:
> On Fri 2018-10-19 09:36:04, Josh Poimboeuf wrote:
> > On Fri, Oct 19, 2018 at 02:16:19PM +0200, Miroslav Benes wrote:
> > > On Thu, 18 Oct 2018, Josh Poimboeuf wrote:
> > >
> > > > On Thu, Oct 18, 2018 at 04:54:56PM +0200, Petr Mladek wrote:
> > > > > OK, what about having just "disable" in sysfs. I agree that it makes
> > > > > much more sense than "enable" now.
> > > > >
> > > > > It might be used also for the reverse operation the same way as
> > > > > "enable" was used before. I think that standalone "reverse" might
> > > > > be confusing when we allow to reverse the operation in both
> > > > > directions.
> > > >
> > > > As long as we're talking about radical changes... how about we just
> > > > don't allow disabling patches at all? Instead a patch can be replaced
> > > > with a 'revert' patch, or an empty 'nop' patch. That would make our
> > > > code simpler and also ensure there's an audit trail.
> > > >
> > > > (Apologies if we've already talked about this. My brain is still mushy
> > > > thanks to Spectre and friends.)
> > >
> > > I think we talked about it last year in Prague and I think we convinced
> > > you that it was not a good idea (...not to allow disabling patches at
> > > all).
> > >
> > > BUT! Empty 'nop' patch is a new idea and we may certainly discuss it.
> >
> > I definitely remember talking about it in Prague, but I don't remember
> > any conclusions.
>
> The revert operation allows to remove a livepatch stuck in the
> transition without forcing.
True, though I question the real world value of that.
> Also implementing empty cumulative patch might be tricky because
> of the callbacks. The current proposal is to call callbacks only
> from the new livepatch. It helps tp keep the interactions easy
> and under control. The way how to take over some change between
> an old and new patch depends on the particular functionality.
Presumably a 'no-op' patch would be special, in that it would call the
un-patch callbacks.
I think the only *real* benefit of this proposal would be that history
would be a straight line, with no backtracking. Similar to git rebase
vs merge. You'd be able to tell what has been applied and reverted just
by looking at what modules are loaded.
But, I now realize that in order for that to be the case, we'd have to
disallow the unloading of replaced modules. But I think we're going to
end up *allowing* the unloading of replaced modules, right?
So maybe it's not worth it. I'll drop it.
> It would mean that the empty patch might need to be custom.
> Users probably would need to ask and wait for it.
>
>
> > My livepatch-related brain cache lines have been
> > flushed thanks to the aforementioned CVEs and my rapidly advancing
> > senility.
>
> Uff, I am not the only one.
:-)
> > > > The amount of flexibility we allow is kind of crazy, considering how
> > > > delicate of an operation live patching is. That reminds me that I
> > > > should bring up my other favorite idea at LPC: require modules to be
> > > > loaded before we "patch" them.
> > >
> > > We talked about this as well and if I remember correctly we came to a
> > > conclusion that it is all about a distribution and maintenance. We cannot
> > > ask customers to load modules they do not need just because we need to
> > > patch them.
> >
> > Fair enough.
> >
> > > One cumulative patch is not that great in this case. I remember you
> > > had a crazy idea how to solve it, but I don't remember details. My
> > > notes from the event say...
> > >
> > > - livepatch code complexity
> > > - make it synchronous with respect to modules loading
> > > - Josh's crazy idea
> > >
> > > That's not much :D
> > >
> > > So yes, we can talk about it and hopefully make proper notes this time.
> >
> > Heh, better notes would be good, otherwise I'll just keep complaining
> > about the same things every year :-) I'll try to remember what my crazy
> > idea was, or maybe come up with some new ones to keep it fresh.
>
> If we do not want to force users to load all patched modules then
> we would need to create a livepatch-per-module. This just moves
> the complexity somewhere else.
>
> One big problem would be how to keep the system consistent. You
> would need to solve races between loading modules and livepatches
> anyway.
>
> For example, you could not load fixed/patched modules when the system
> is not fully patched yet. You would need to load the module and
> the related livepatch at the same time and follow the consistency
> model as we do now.
>
> OK, there was the idea to refuse loading modules when livepatch
> transition is in progress. But it might not be acceptable,
> especially when the transition gets blocked infinitely
> and manual intervention would be needed.
>
> I agree that the current solution adds complexity to
> the livepatching code but it is not that complicated.
> Races with loading modules and livepatches in parallel
> are solved by mod->klp_active flag. There are no other
> races because all other operations are done on code
> that is not actively used. One good thing is that
> everything is in one place and kernel has it under
> control.
>
> I am open to discuss it. But we would need to come up with
> some clever solution.
Yeah, I think that's pretty much the crazy idea Miroslav mentioned. The
patch would consist of several modules. The parent module would
register the patch and patch vmlinux. Each child module would be
associated with a to-be-patched module. The child modules could be
loaded on demand, either by special klp code or by modprobe.
As you described, there would be some races to think about. However, it
would also have some benefits.
I *hope* it would mean we could get rid of a lot of our ugly hacks, like
- klp symbols, klp relas
- preserving ELF data, PLT's, other horrible arch-specific things
- klp.arch..altinstructions, klp.arch..parainstructions
- manually calling apply_relocate_add()
However... we might still need some of those things for another reason:
to bypass exported symbol protections. It needs some more
investigation.
Given this discussion, I'm thinking there wouldn't be much to discuss at
LPC for this topic unless we had a prototype to look at (which I won't
have time to do). So I may drop my talk in favor of giving more time
for other more tangible discussions.
--
Josh