Re: [PATCH v5 13/15] livepatch: change to a per-task consistency model

From: Miroslav Benes
Date: Fri Feb 17 2017 - 03:51:47 EST


On Thu, 16 Feb 2017, Josh Poimboeuf wrote:

> On Thu, Feb 16, 2017 at 03:33:26PM +0100, Miroslav Benes wrote:
> >
> > > @@ -347,22 +356,36 @@ static int __klp_enable_patch(struct klp_patch *patch)
> > >
> > > pr_notice("enabling patch '%s'\n", patch->mod->name);
> > >
> > > + klp_init_transition(patch, KLP_PATCHED);
> > > +
> > > + /*
> > > + * Enforce the order of the func->transition writes in
> > > + * klp_init_transition() and the ops->func_stack writes in
> > > + * klp_patch_object(), so that klp_ftrace_handler() will see the
> > > + * func->transition updates before the handler is registered and the
> > > + * new funcs become visible to the handler.
> > > + */
> > > + smp_wmb();
> > > +
> > > klp_for_each_object(patch, obj) {
> > > if (!klp_is_object_loaded(obj))
> > > continue;
> > >
> > > ret = klp_patch_object(obj);
> > > - if (ret)
> > > - goto unregister;
> > > + if (ret) {
> > > + pr_warn("failed to enable patch '%s'\n",
> > > + patch->mod->name);
> > > +
> > > + klp_cancel_transition();
> > > + return ret;
> > > + }
> >
> > [...]
> >
> > > +static void klp_complete_transition(void)
> > > +{
> > > + struct klp_object *obj;
> > > + struct klp_func *func;
> > > + struct task_struct *g, *task;
> > > + unsigned int cpu;
> > > +
> > > + if (klp_target_state == KLP_UNPATCHED) {
> > > + /*
> > > + * All tasks have transitioned to KLP_UNPATCHED so we can now
> > > + * remove the new functions from the func_stack.
> > > + */
> > > + klp_unpatch_objects(klp_transition_patch);
> > > +
> > > + /*
> > > + * Make sure klp_ftrace_handler() can no longer see functions
> > > + * from this patch on the ops->func_stack. Otherwise, after
> > > + * func->transition gets cleared, the handler may choose a
> > > + * removed function.
> > > + */
> > > + synchronize_rcu();
> > > + }
> > > +
> > > + if (klp_transition_patch->immediate)
> > > + goto done;
> > > +
> > > + klp_for_each_object(klp_transition_patch, obj)
> > > + klp_for_each_func(obj, func)
> > > + func->transition = false;
> > > +
> > > + /* Prevent klp_ftrace_handler() from seeing KLP_UNDEFINED state */
> > > + if (klp_target_state == KLP_PATCHED)
> > > + synchronize_rcu();
> > > +
> > > + read_lock(&tasklist_lock);
> > > + for_each_process_thread(g, task) {
> > > + WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> > > + task->patch_state = KLP_UNDEFINED;
> > > + }
> > > + read_unlock(&tasklist_lock);
> > > +
> > > + for_each_possible_cpu(cpu) {
> > > + task = idle_task(cpu);
> > > + WARN_ON_ONCE(test_tsk_thread_flag(task, TIF_PATCH_PENDING));
> > > + task->patch_state = KLP_UNDEFINED;
> > > + }
> > > +
> > > +done:
> > > + klp_target_state = KLP_UNDEFINED;
> > > + klp_transition_patch = NULL;
> > > +}
> > > +
> > > +/*
> > > + * This is called in the error path, to cancel a transition before it has
> > > + * started, i.e. klp_init_transition() has been called but
> > > + * klp_start_transition() hasn't. If the transition *has* been started,
> > > + * klp_reverse_transition() should be used instead.
> > > + */
> > > +void klp_cancel_transition(void)
> > > +{
> > > + klp_target_state = !klp_target_state;
> > > + klp_complete_transition();
> > > +}
> >
> > If we fail to enable patch in __klp_enable_patch(), we call
> > klp_cancel_transition() and get to klp_complete_transition(). If the patch
> > has immediate set to true, the module would not be allowed to go (the
> > changes are in the last patch unfortunately, but my remark is closely
> > related to klp_cancel_transition() and __klp_enable_patch()). This could
> > annoy a user if it was due to a trivial reason. So we could call
> > module_put() in this case. It should be safe as no task could be in a new
> > function thanks to klp_ftrace_handler() logic.
> >
> > Pity I have not spotted this earlier.
> >
> > Putting module_put(patch->mod) right after klp_cancel_transition() call in
> > __klp_enable_patch() would be the simplest fix (as a part of 15/15 patch).
> > Maybe with a comment that it is safe to do it there.
> >
> > What do you think?
>
> Good catch. I agree that 15/15 should have something like that.
>
> Also, the module_put() will be needed for non-immediate patches which
> have a func->immediate set.

Right. This further complicates it.

> What do you think about the following? I tried to put the logic in
> klp_complete_transition(), so the module_put()'s would be in one place.
> But it was too messy, so I put it in klp_cancel_transition() instead.
>
> diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
> index e96346e..bd1c1fd 100644
> --- a/kernel/livepatch/transition.c
> +++ b/kernel/livepatch/transition.c
> @@ -121,8 +121,28 @@ static void klp_complete_transition(void)
> */
> void klp_cancel_transition(void)
> {
> + bool immediate_func = false;
> +
> klp_target_state = !klp_target_state;
> klp_complete_transition();
> +
> + if (klp_target_state == KLP_PATCHED)
> + return;

This is not needed, I think. We call klp_cancel_transition() in
__klp_enable_patch() only. klp_target_state is set to KLP_PATCHED there
(through klp_init_transition()) and negated here. We know it must be
KLP_UNPATCHED.

Moreover, due to klp_complete_transition() klp_target_state is always
KLP_UNDEFINED after it.

> +
> + /*
> + * In the enable error path, even immediate patches can be safely
> + * removed because the transition hasn't been started yet.
> + *
> + * klp_complete_transition() doesn't have a module_put() for immediate
> + * patches, so do it here.
> + */
> + klp_for_each_object(klp_transition_patch, obj)
> + klp_for_each_func(obj, func)
> + if (func->immediate)
> + immediate_func = true;
> +
> + if (klp_transition_patch->immediate || immediate_func)
> + module_put(klp_transition_patch->mod);

Almost correct. The only problem is that klp_transition_patch is NULL at
this point. klp_complete_transition() does that and it should stay there
in my opinion to keep it simple.

So we could either move all this to __klp_enable_patch(), where patch
variable is defined, or we could store klp_transition_patch to a local
variable here in klp_cancel_transition() before klp_complete_transition()
is called. That should be safe. I like the latter more, because it keeps
the code in klp_cancel_transition() where it belongs.

Miroslav