Re: [RFC PATCH v1.9 08/14] livepatch: separate enabled and patched states

From: Chris J Arges
Date: Tue Apr 12 2016 - 13:36:05 EST


On Tue, Apr 12, 2016 at 12:16:00PM -0500, Josh Poimboeuf wrote:
> On Tue, Apr 12, 2016 at 09:44:43AM -0500, Chris J Arges wrote:
> > On Fri, Mar 25, 2016 at 02:34:55PM -0500, Josh Poimboeuf wrote:
> > > Once we have a consistency model, patches and their objects will be
> > > enabled and disabled at different times. For example, when a patch is
> > > disabled, its loaded objects' funcs can remain registered with ftrace
> > > indefinitely until the unpatching operation is complete and they're no
> > > longer in use.
> > >
> > > It's less confusing if we give them different names: patches can be
> > > enabled or disabled; objects (and their funcs) can be patched or
> > > unpatched:
> > >
> > > - Enabled means that a patch is logically enabled (but not necessarily
> > > fully applied).
> > >
> > > - Patched means that an object's funcs are registered with ftrace and
> > > added to the klp_ops func stack.
> > >
> > > Also, since these states are binary, represent them with booleans
> > > instead of ints.
> > >
> >
> > Josh,
> >
> > Awesome work here first of all!
> >
> > Looking through the patchset a bit I see the following bools:
> > - functions: patched, transitioning
> > - objects: patched
> > - patches: enabled
> >
> > It seems this reflects the following states at a patch level:
> > disabled - module inserted, not yet logically enabled
> > enabled - logically enabled, but not all objects/functions patched
> > transitioning - objects/functions are being applied or reverted
> > patched - all objects/functions patched
> >
> > However each object and function could have the same state and the parent object
> > just reflects the 'aggregate state'. For example if all funcs in an object are
> > patched then the object is also patched.
> >
> > Perhaps we need more states (or maybe there will be more in the future), but
> > wouldn't this just be easier to have something like for each patch, object, and
> > function?
> >
> > enum klp_state{
> > KLP_DISABLED,
> > KLP_ENABLED,
> > KLP_TRANSITION,
> > KLP_PATCHED,
> > }
> >
> >
> > I'm happy to help out too.
>
> Thanks for the comments. First let me try to explain why I chose two
> bools rather than a single state variable.
>
> At a func level, it's always in one of the following states:
>
> patched=0 transition=0: unpatched
> patched=0 transition=1: unpatched, temporary starting state
> patched=1 transition=1: patched, may be visible to some tasks
> patched=1 transition=0: patched, visible to all tasks
>
> And for unpatching, it goes in the reverse order:
>
> patched=1 transition=0: patched, visible to all tasks
> patched=1 transition=1: patched, may be visible to some tasks
> patched=0 transition=1: unpatched, temporary ending state
> patched=0 transition=0: unpatched
>
> (note to self, put the above in a comment somewhere)
>

This is helpful and makes more sense.

> Now, we could convert the states from two bools into a single enum. But
> I think it would complicate the code. Because nowhere in the code does
> it need to access the full state. In some places it accesses
> func->patched and in other places it accesses func->transition, but it
> never needs to access both.
>
> So for example, the following check in klp_ftrace_handler():
>
> if (unlikely(func->transition)) {
>
> would change to:
>
> if (unlikely(func->state == KLP_ENABLED || func->state == KLP_TRANSITION)) {
>
> Sure, we could use a helper function to make that more readable. But
> with the bools its clearer and you don't need a helper function.
>
> As another example, see the following code in klp_complete_transition():
>
> klp_for_each_object(klp_transition_patch, obj)
> klp_for_each_func(obj, func)
> func->transition = false;
>
> The last line would have to be changed to something like:
>
> if (patching...)
> func->state = KLP_PATCHED;
> else /* unpatching */
> func->state = KLP_DISABLED;
>
> So that's why I picked two bools over a single state variable: it seems
> to make the code simpler.
>
> As to the other idea about copying the func states to the object and
> patch level, I get the feeling that would also complicate the code. We
> patch and transition at a function granularity, so the "real" state is
> at the func level. Proliferating that state to objects and patches
> might be tricky to get right, and it could make it harder to understand
> exactly what's going on in the code. And I don't really see a benefit
> to doing that.
>
> --
> Josh
>

I think keeping it simple makes a ton of sense, thanks for the explanation.

I'm also envisioning how to troubleshoot patches that don't converge.

So if someone wanted to check if a function has been fully patched on all tasks
they would check:

/sys/kernel/livepatch/<patch>/transition
/sys/kernel/livepatch/<patch>/patched

And see if patched=1 and transition=0 things are applied.

However if patched=1 and transition=1 then if a user wanted to dig down and see
which pid's we were waiting on we could do:

cat /proc/*/patch_status

and check if any pid's patch_status values still contain KLP_UNIVERSE_OLD.

If we wanted to see which function in the task needs patching we could:
cat /proc/<pid>/stack
and see if anything in that stack contains the functions in question.

Anyway I'll keep looking at this patchset, patching using the samples/livepatch
code works for me without issue so far.

--chris