Re: [PATCH 2/3] livepatch: Allow user to specify functions to search for on a stack

From: Miroslav Benes
Date: Thu Nov 25 2021 - 05:22:12 EST


On Thu, 25 Nov 2021, Petr Mladek wrote:

> On Mon 2021-11-22 10:53:21, Joe Lawrence wrote:
> > On 11/22/21 2:57 AM, Miroslav Benes wrote:
> > > On Fri, 19 Nov 2021, Josh Poimboeuf wrote:
> > >
> > >> Thanks for doing this! And at peterz-esque speed no less :-)
> > >>
> > >> On Fri, Nov 19, 2021 at 10:03:26AM +0100, Miroslav Benes wrote:
> > >>> livepatch's consistency model requires that no live patched function
> > >>> must be found on any task's stack during a transition process after a
> > >>> live patch is applied. It is achieved by walking through stacks of all
> > >>> blocked tasks.
> > >>>
> > >>> The user might also want to define more functions to search for without
> > >>> them being patched at all. It may either help with preparing a live
> > >>> patch, which would otherwise require additional touches to achieve the
> > >>> consistency
> > >>
> > >> Do we have any examples of this situation we can add to the commit log?
> > >
> > > I do not have anything at hand. Joe, do you remember the case you
> > > mentioned previously about adding a nop to a function?
> >
> > Maybe adding a hypothetical scenario to the commit log would suffice?
>
> I wonder if we could describe a scenario based on the thread about
> .cold code variants, see
> https://lore.kernel.org/all/20211112015003.pefl656m3zmir6ov@treble/
>
> This feature would allow to safely livepatch already released
> kernels where the unwinder is not able to reliably detect
> a newly discovered problems.

That scenario is already described (well, without actually using the .cold
suffix) a few lines below. I'll try to improve the changelog.

> > >>> or it can be used to overcome deficiencies the stack
> > >>> checking inherently has. For example, GCC may optimize a function so
> > >>> that a part of it is moved to a different section and the function would
> > >>> jump to it. This child function would not be found on a stack in this
> > >>> case, but it may be important to search for it so that, again, the
> > >>> consistency is achieved.
> > >>>
> > >>> Allow the user to specify such functions on klp_object level.
> > >>>
> > >>> Signed-off-by: Miroslav Benes <mbenes@xxxxxxx>
> > >>> ---
> > >>> include/linux/livepatch.h | 11 +++++++++++
> > >>> kernel/livepatch/core.c | 16 ++++++++++++++++
> > >>> kernel/livepatch/transition.c | 21 ++++++++++++++++-----
> > >>> 3 files changed, 43 insertions(+), 5 deletions(-)
> > >>>
> > >>> diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
> > >>> index 2614247a9781..89df578af8c3 100644
> > >>> --- a/include/linux/livepatch.h
> > >>> +++ b/include/linux/livepatch.h
> > >>> @@ -106,9 +106,11 @@ struct klp_callbacks {
> > >>> * struct klp_object - kernel object structure for live patching
> > >>> * @name: module name (or NULL for vmlinux)
> > >>> * @funcs: function entries for functions to be patched in the object
> > >>> + * @funcs_stack: function entries for functions to be stack checked
> > >>
> > >> So there are two arrays/lists of 'klp_func', and two implied meanings of
> > >> what a 'klp_func' is and how it's initialized.
> > >>
> > >> Might it be simpler and more explicit to just add a new external field
> > >> to 'klp_func' and continue to have a single 'funcs' array? Similar to
> > >> what we already do with the special-casing of 'nop', except it would be
> > >> an external field, e.g. 'no_patch' or 'stack_only'.
> >
> > I'll add that the first thing that came to mind when you raised this
> > feature idea in the other thread was to support existing klp_funcs array
> > with NULL new_func's.
>
> Please, solve this with the extra flag, e.g. .stack_only, as
> already suggested. It will help to distinguish mistakes and
> intentions. Also it will allow to find these symbols by grep.

Indeed, that is what I did for v2. Even v1 relied on new_func being NULL,
but I do not like that much. An extra flag is definitely more robust.
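
To illustrate why the flag is more robust, here is a rough user-space sketch,
not the v2 code. struct sketch_func and sketch_func_validate() are made-up
names, and the struct is a simplified stand-in for klp_func, with stack_only
as the flag name suggested above. With an explicit flag, a NULL new_func
without stack_only can be rejected as a mistake instead of being silently
treated as "stack check only".

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct klp_func; NOT the kernel definition. */
struct sketch_func {
	const char *old_name;	/* function to look for */
	void *new_func;		/* replacement, NULL when nothing is patched */
	bool stack_only;	/* hypothetical flag: stack check only, no patching */
};

/*
 * Return 0 when the entry is consistent, -1 otherwise.
 *
 * A stack_only entry must not carry a replacement, and a regular entry
 * must carry one. Relying on new_func == NULL alone could not tell a
 * forgotten replacement from an intentional stack-only entry.
 */
int sketch_func_validate(const struct sketch_func *func)
{
	if (!func->old_name)
		return -1;

	if (func->stack_only)
		return func->new_func ? -1 : 0;

	return func->new_func ? 0 : -1;
}
```

It also keeps such entries easy to find with grep for ".stack_only".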

> > I didn't go look to see how invasive it would be,
> > but it will be interesting to see if a single list approach turns out
> > any simpler for v2.
>
> I am not sure either. But I expect that it will be easier than
> the extra array.

So, an extra flag and a single array make certain things (initialization)
definitely easier. On the other hand, there are suddenly more problems to
think about (and I haven't solved them yet):

- I need to make it work with nop functions, especially if we allow a
scenario where a klp_object instance contains only stack_only
functions. Would that be useful? For example, to patch something in a
module and add a stack_only entry for a function in vmlinux.

If yes, then the interaction with nops is not completely
straightforward, and some parts of the code would have to be
changed (for example, how the obj->patched flag is handled).

- klp_func instances are directly mirrored in sysfs. Do we want to keep
stack_only functions there too? If not, it makes the whole thing
slightly more difficult given how we manage kobjects.

Nothing really difficult to implement if we come up with answers.
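
To make the first question a bit more concrete, a rough user-space sketch
follows. All names here (mock_func, mock_object and the two helpers) are
hypothetical and the structures are simplified stand-ins, not the kernel
ones. The point is only that an object with nothing but stack_only entries
has zero functions to redirect, yet all of its entries still matter for
the stack check, so "is there anything to patch" and "is there anything
to check" become two different questions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; NOT the kernel's klp_func/klp_object. */
struct mock_func {
	const char *old_name;
	void *new_func;		/* NULL for stack_only entries */
	bool stack_only;	/* hypothetical flag */
};

struct mock_object {
	const char *name;	/* NULL for vmlinux */
	const struct mock_func *funcs;
	size_t nr_funcs;
};

/* Count entries that would really redirect a function. */
size_t mock_object_nr_patchable(const struct mock_object *obj)
{
	size_t i, count = 0;

	for (i = 0; i < obj->nr_funcs; i++) {
		if (!obj->funcs[i].stack_only)
			count++;
	}

	return count;
}

/* Every entry, patched or stack_only, has to be searched for on stacks. */
size_t mock_object_nr_stack_checked(const struct mock_object *obj)
{
	return obj->nr_funcs;
}
```

For a vmlinux object holding a single stack_only entry, the first helper
returns 0 and the second returns 1, which is the case where the handling
of obj->patched would need a decision.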

Miroslav