Re: [RFC PATCH 6/9] livepatch: create per-task consistency model

From: Josh Poimboeuf
Date: Wed Feb 11 2015 - 15:19:35 EST


On Wed, Feb 11, 2015 at 11:21:51AM +0100, Miroslav Benes wrote:
>
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
>
> [...]
>
> > @@ -38,14 +39,34 @@ static void notrace klp_ftrace_handler(unsigned long ip,
> >  	ops = container_of(fops, struct klp_ops, fops);
> >
> >  	rcu_read_lock();
> > +
> >  	func = list_first_or_null_rcu(&ops->func_stack, struct klp_func,
> >  				      stack_node);
> > -	rcu_read_unlock();
> >
> >  	if (WARN_ON_ONCE(!func))
> > -		return;
> > +		goto unlock;
> > +
> > +	if (unlikely(func->transition)) {
> > +		/* corresponding smp_wmb() is in klp_init_transition() */
> > +		smp_rmb();
> > +
> > +		if (current->klp_universe == KLP_UNIVERSE_OLD) {
> > +			/*
> > +			 * Use the previously patched version of the function.
> > +			 * If no previous patches exist, use the original
> > +			 * function.
> > +			 */
> > +			func = list_entry_rcu(func->stack_node.next,
> > +					      struct klp_func, stack_node);
> > +
> > +			if (&func->stack_node == &ops->func_stack)
> > +				goto unlock;
> > +		}
> > +	}
> >
> >  	klp_arch_set_pc(regs, (unsigned long)func->new_func);
> > +unlock:
> > +	rcu_read_unlock();
> >  }
>
> I decided to understand the code more before answering the email about the
> race and found another problem. I think.
>
> Imagine we patched some function foo() with foo_1() from patch_1 and now
> we'd like to patch it again with foo_2() in patch_2. __klp_enable_patch
> calls klp_init_transition which sets klp_universe for all processes to
> KLP_UNIVERSE_OLD and marks foo_2() for transition (func->transition = 1).
> Then __klp_enable_patch adds foo_2() to the RCU-protected list for foo().
> BUT what if somebody calls foo() right between klp_init_transition and
> the loop in __klp_enable_patch? The ftrace handler first returns the
> first entry in the list which is foo_1() (foo_2() is still not present),
> then it checks for func->transition. It is 1.

No, actually foo_1()'s func->transition will be 0. Only foo_2()'s
func->transition will be 1.
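
To make that concrete, here is a rough userspace sketch of the selection
logic in the handler above. It is not kernel code: struct func, enum
universe, select_func() and selected_name() are made-up stand-ins for
illustration, and the RCU list is reduced to a plain singly linked stack.
The only point is that the transition flag lives in each func, so the
handler only steps down the stack when the *top* entry is in transition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not the real livepatch structures. */
enum universe { UNIVERSE_OLD, UNIVERSE_NEW };

struct func {
	const char *name;	/* e.g. "foo_1" */
	int transition;		/* per-func flag, set only on the func in transition */
	struct func *next;	/* next (older) entry on the stack, NULL at the end */
};

/*
 * Mimic the handler's choice: return the stack entry whose replacement
 * would run, or NULL when the original function runs instead.
 */
static const struct func *select_func(const struct func *top, enum universe u)
{
	const struct func *f = top;

	if (!f)
		return NULL;

	/* Only the top entry's own flag is consulted. */
	if (f->transition && u == UNIVERSE_OLD)
		f = f->next;	/* previous patch, or NULL for the original */

	return f;
}

/* Convenience wrapper: name of what would run, "foo" meaning the original. */
static const char *selected_name(const struct func *top, enum universe u)
{
	const struct func *f = select_func(top, u);

	return f ? f->name : "foo";
}
```

In the window you describe, the stack still holds only foo_1 with
transition == 0, so selected_name(&foo_1, UNIVERSE_OLD) gives "foo_1".
Once foo_2 (transition == 1) is pushed on top, an old-universe task is
still routed to "foo_1" and a new-universe task gets "foo_2".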

> It checks for
> current->klp_universe which is KLP_UNIVERSE_OLD and so the next entry is
> retrieved. There is no such entry, so the original foo() is called. This is
> obviously wrong because foo_1() was expected.
>
> Everything would work fine if one would call foo() before
> klp_start_transition and after the loop in __klp_enable_patch. The
> solution might be to move the setting of func->transition to
> klp_start_transition, but this could break something different. I don't
> know yet.
>
> Am I wrong?
>
> Miroslav

--
Josh