Re: Re: [RFC PATCH 0/2] kpatch: dynamic kernel patching

From: Vojtech Pavlik
Date: Sat May 17 2014 - 18:46:16 EST


On Sat, May 17, 2014 at 03:09:57AM +0900, Masami Hiramatsu wrote:
> (2014/05/17 1:27), Jiri Kosina wrote:
> > On Tue, 6 May 2014, Steven Rostedt wrote:
> >
> >>> However, I also think if users can accept such freezing wait-time,
> >>> it means they can also accept kexec based "checkpoint-restart" patching.
> >>> So, I think the final goal of the kpatch will be live patching without
> >>> stopping the machine. I'm discussing the issue on github #138, but that is
> >>> off-topic. :)
> >>
> >> I agree with Ingo too. Being conservative at first is the right
> >> approach here. We should start out with a stop_machine making sure that
> >> everything is sane before we continue. Sure, that's not much different
> >> than a kexec, but lets take things one step at a time.
> >>
> >> ftrace did the stop_machine (and still does for some archs), and slowly
> >> moved to a more efficient method. kpatch/kgraft should follow suit.
> >
> > I don't really agree here.
> >
> > I actually believe that "lazy" switching kgraft is doing provides a little
> > bit more in the sense of consistency than stop_machine()-based aproach.
> >
> > Consider this scenario:
> >
> > void foo()
> > {
> > for (i=0; i<10000; i++) {
> > bar(i);
> > something_else(i);
> > }
> > }
>
> In this case, I'd recommend you to add foo() to replacing target
> as dummy. Then, kpatch can ensure foo() is actually not running. :)

The problem is: Where do you stop?

Adding the whole list of functions that transitively may ever call bar()
can be pretty much the whole kernel. And given that you can be calling
via pointer, you can't often even tell who is calling bar().

So yes, a developer could manually include foo() in the to be patched
list, so that it is checked that foo() isn't being executed while
patching.

But that would have to be done entirely manually after thoroughly
understanding the implications of the patch.

> > Let's say you want to live-patch bar(). With stop_machine()-based aproach,
> > you can easily end-up with old bar() and new bar() being called in two
> > consecutive iterations before the loop is even exited, right? (especially
> > on preemptible kernel, or if something_else() goes to sleep).
> >
> > With lazy-switching implemented in kgraft, this can never happen.
>
> And I guess similar thing may happen with kgraft. If old function and
> new function share a non-auto variable and they modify it different way,
> the result will be unexpected by the mutual interference.

By non-auto I assume you mean globally accessible variable or data
structures.

Yes, with kGraft the new function has to be backward compatible with
pre-patch global data layout and semantics.

The parameters, return value and their semantics are free to change,
though, and kGraft guarantees consistency there.

--
Vojtech Pavlik
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/