live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())

From: Ingo Molnar
Date: Fri Feb 20 2015 - 05:44:28 EST

Next message: Andrew Cooper: "Re: [Xen-devel] NUMA_BALANCING and Xen PV guest regression in 3.20-rc0"
Previous message: Michal Simek: "Re: [PATCH v4] dma: Add Xilinx AXI Direct Memory Access Engine driver support"
In reply to: Jiri Kosina: "Re: [PATCH 1/3] sched: add sched_task_call()"
Next in thread: Jiri Kosina: "Re: live patching design (was: Re: [PATCH 1/3] sched: add sched_task_call())"

* Jiri Kosina <jkosina@xxxxxxx> wrote:

> On Fri, 20 Feb 2015, Ingo Molnar wrote:
>
> > So if your design is based on being able to discover
> > 'live' functions in the kernel stack dump of all tasks
> > in the system, I think you need a serious reboot of the
> > whole approach and get rid of that fragility before any
> > of that functionality gets upstream!
>
> So let me repeat again, just to make sure that no more
> confusion is being spread around -- there are approaches
> which do rely on stack contents, and approaches which
> don't. kpatch (the Red Hat solution) and ksplice (the
> Oracle solution) contain stack analysis as a conceptual
> design step, kgraft (the SUSE solution) doesn't.

So just to make my position really clear: any talk about
looking at the kernel stack for backtraces is just crazy
talk, considering how stack backtrace technology stands
today and in the reasonable near future!

With that out of the way, the only safe mechanism to live
patch the kernel (for sufficiently simple sets of changes
to singular functions) I'm aware of at the moment is:

 - forcing all user space tasks out of kernel mode and
   intercepting them in a safe state, i.e. making sure that
   no kernel code is executed and no kernel stack state is
   used (modulo code closely related to the live
   patching mechanism and kernel threads in a safe state;
   let's ignore them for this argument).

There are two variants of this concept, which differ in the
timing of how user-space tasks are forced into user mode:

 - the simple method: force all user-space tasks out of
   kernel mode, stop the machine for a brief moment and be
   done with the patching safely and essentially
   atomically.

 - the complicated method spread out over time: uses the
   same essential mechanism plus the ftrace patching
   machinery to detect whether all tasks have transitioned
   through a version flip. [this is what kgraft does in
   part.]
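
To make the simple method concrete, here is a minimal user-space
sketch in C. It is only a toy model, not kernel code: struct task,
current_impl, all_tasks_in_user_mode() and try_apply_patch() are
hypothetical names invented for illustration, and a plain function
pointer stands in for the real patching machinery. The point it shows
is that the decision needs only a per-task "is it in kernel mode"
flag, never a look at any stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task model: we only track whether it runs kernel code. */
struct task {
    bool in_kernel;
};

typedef int (*patched_fn)(int);

static int old_impl(int x) { return x + 1; }   /* version being replaced */
static int new_impl(int x) { return x + 2; }   /* replacement version */

/* Every caller goes through this pointer (stand-in for the patch site). */
static patched_fn current_impl = old_impl;

/* True only if no task is executing kernel code. */
static bool all_tasks_in_user_mode(const struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tasks[i].in_kernel)
            return false;
    return true;
}

/*
 * Simple method: patch only once every task is out of kernel mode.
 * In a real kernel the switch would happen with the machine briefly
 * stopped; here the single pointer store models that atomic step.
 * Returns true if the patch was applied.
 */
static bool try_apply_patch(const struct task *tasks, size_t n,
                            patched_fn replacement)
{
    assert(replacement != NULL);
    if (!all_tasks_in_user_mode(tasks, n))
        return false;              /* someone is still in the kernel */
    current_impl = replacement;    /* all callers flip at once */
    return true;
}
```

A caller would retry try_apply_patch() until it succeeds; after that
every task sees new_impl and there is no intermediate state.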

All fundamental pieces of the simple method are also
necessary to get a guaranteed-time transition out of the
complicated method: task tracking and transparent catching
of tasks, handling kthreads, etc.
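
The complicated method can be sketched on top of the same task
tracking. Again this is a toy user-space model in C under stated
assumptions, not how kgraft is actually implemented: the migrated
flag, start_patch(), task_exit_kernel(), call_patched() and
patch_complete() are hypothetical names, and the per-call dispatch
stands in for the ftrace-based redirection. Treating tasks that are
already in user mode as migrated at patch start is also an assumption
of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task model with a per-task version flag. */
struct task {
    bool in_kernel;
    bool migrated;   /* true once the task may use the new version */
};

static bool patch_in_progress;

static int old_impl(int x) { return x + 1; }   /* version being replaced */
static int new_impl(int x) { return x + 2; }   /* replacement version */

/* Per-task dispatch: stand-in for the ftrace-based redirection. */
static int call_patched(const struct task *t, int x)
{
    assert(t != NULL);
    return t->migrated ? new_impl(x) : old_impl(x);
}

/* Begin the transition: tasks with no kernel state can flip at once. */
static void start_patch(struct task *tasks, size_t n)
{
    patch_in_progress = true;
    for (size_t i = 0; i < n; i++)
        tasks[i].migrated = !tasks[i].in_kernel;
}

/* Called when a task leaves kernel mode: the safe point to flip it. */
static void task_exit_kernel(struct task *t)
{
    t->in_kernel = false;
    if (patch_in_progress)
        t->migrated = true;
}

/* The transition is over once every task has gone through the flip. */
static bool patch_complete(const struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!tasks[i].migrated)
            return false;
    return true;
}
```

Until patch_complete() returns true, old and new versions coexist,
which is the intermediate state the simple method avoids. Bounding
that period requires the ability to force the remaining tasks out of
kernel mode, i.e. the same pieces the simple method is built from.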

My argument is that the simple method should be implemented
first and foremost.

Then people can do add-on features to possibly spread out
the new function versions in a more complicated way if they
want to avoid the stop-all-tasks transition - although I'm
not convinced about it: I'm sure many sysadmins would like
the bug patching to be over with quickly and not have their
systems left in an intermediate state, as kgraft does.

In any case, as per my arguments above, examining the
kernel stack is superfluous (so we won't be exposed to the
fragility of it either): there's no need to examine it and
writing such patches is misguided...

Thanks,

Ingo
