Re: [RFC PATCH v1.9 00/14] livepatch: hybrid consistency model

From: Josh Poimboeuf
Date: Mon Apr 04 2016 - 13:03:16 EST


On Thu, Mar 31, 2016 at 02:54:26PM +0200, Miroslav Benes wrote:
>
> Hi,
>
> this is a great work. I'll have to review it properly (especially 13/14,
> probably several times as it is a heavy stuff), but I've gathered some
> notes so there they are.
>
> On Fri, 25 Mar 2016, Josh Poimboeuf wrote:
>
> > These patches are still a work in progress, but Jiri asked that I share
> > them before I go on vacation next week. Based on origin/master because
> > it has CONFIG_STACK_VALIDATION.
> >
> > This has two consistency models: the immediate model (like in today's
> > code) and the new kpatch/kgraft hybrid model.
> >
> > The default is the hybrid model: kgraft's per-task consistency and
> > syscall barrier switching combined with kpatch's stack trace switching.
> > There are also a number of fallback options which make it pretty
> > flexible, yet the code is still quite simple.
> >
> > Patches are applied on a per-task basis, when the task is deemed safe to
> > switch over. It uses a tiered approach to determine when a task is safe
> > and can be switched.
> >
> > The first wave of attack is stack checking of sleeping tasks. If no
> > affected functions are on the stack of a given task, the task is
> > switched. In most cases this will patch most or all of the tasks on the
> > first try. Otherwise it'll keep trying periodically. This option is
> > only available if the architecture has reliable stacks
> > (CONFIG_RELIABLE_STACKTRACE and CONFIG_STACK_VALIDATION).
>
> I think we could allow periodic checking even for
> !CONFIG_RELIABLE_STACKTRACE situations. The problematic task could migrate
> by itself after some time (meaning it woke up meanwhile and sleeps
> somewhere else now, or it went through a syscall boundary). So we can
> gain something, especially when combined with a fake signal approach. More
> on that below and in my 13/14 mail.

Good idea, agreed.

> > The next line of attack, if needed, is syscall/IRQ switching. A task is
> > switched when it returns from a system call or a user space IRQ. This
> > approach is less than ideal because it usually requires signaling tasks
> > to get them to switch. It also doesn't work for kthreads. But it's
> > still useful in some cases when user tasks get stuck sleeping on an
> > affected function.
> >
> > For architectures which don't have reliable stacks, users have two
> > options:
> >
> > a) use the hybrid fallback option of using only the syscall/IRQ
> > switching (which is the default);
> >
> > b) or they can use the immediate model (which is the model we already
> > have today) by setting the patch->immediate flag.
> >
> > There's also a func->immediate flag which allows users to specify that
> > certain functions in the patch can be applied without per-task
> > consistency. This might be useful if you want to patch a common
> > function like schedule(), and the function change doesn't need
> > consistency but the rest of the patch does.
> >
> > Still have a lot of TODOs, some of them are listed here. If you see
> > something objectionable, it might be a good idea to make sure it's not
> > already on the TODO list :-)
> >
> > TODO:
> > - actually test it
> > - don't use TIP_KLP_NEED_UPDATE in arch-independent code
> > - figure out new API to keep the use of task_rq_lock() in the sched code
>
> Hm, no idea how to do it so that everyone is satisfied. I still remember
> Peter's protests.

Yeah, I'll loop in Peter and Ingo on v2 so we can get their input.

> > - cleaner way to detect preemption on the stack
> > - allow patch modules to be removed. still needs more discussion and
> > thought. maybe something like Miroslav's patch would be good:
> > https://lkml.kernel.org/r/alpine.LNX.2.00.1512150857510.24899@xxxxxxxxxxxxx
>
> Yup, that could be part of the patch set. The second option (to rework
> klp_unregister_patch and move kobject_put out of mutex protected parts) is
> maybe a way to go. The mutex_trylock approach works as well, but it is not
> clean and nice enough I guess. However the patch is there :).
>
> Anyway the module removal should be prohibited when one uses immmediate
> flag set to true. Without consistency model we cannot say if it is safe to
> remove the module. Some process could still be in its code.

Good point.

>
> > - come up with a better name than universe? KLP_STATE_PREV/NEXT?
> > KLP_UNPATCHED/PATCHED? there were some objections to the name in v1.
> > - update documentation for sysfs, proc, livepatch
> > - need atomic accesses or READ_ONCE/WRITE_ONCE anywhere?
> > - ability to force a patch to the goal universe
>
> This could be made by a call to klp_update_task_universe for all tasks,
> couldn't it? We have something similar in kgraft.

Yeah, I think so.

> > - try ftrace handler switching idea from v1 cover letter
> > - split up the patches better
> > - cc all the right people
>
> I'd add a fake signal facility for sleeping non-migrated tasks. This
> would accelerate a migration to a new universe. We have it in kgraft for
> quite some time and it worked out great. See
> lkml.kernel.org/r/1430739625-4658-9-git-send-email-jslaby@xxxxxxx which
> went with Jiri's kgraft-on-klp patch set. See also Oleg's reply as it is
> important (I changed kgraft implementation according to that).

Ok, I'll look into sending a fake signal to remaining tasks.

--
Josh