Re: barriers: was: [RFC PATCH v2 17/18] livepatch: change to a per-task consistency model

From: Josh Poimboeuf
Date: Wed May 04 2016 - 13:25:25 EST


On Wed, May 04, 2016 at 04:12:05PM +0200, Petr Mladek wrote:
> On Wed 2016-05-04 14:39:40, Petr Mladek wrote:
> > *
> > * Note that the task must never be migrated to the target
> > * state when being inside this ftrace handler.
> > */
> >
> > We might want to move the second paragraph on top of the function.
> > It is a basic and important fact. It actually explains why the first
> > read barrier is not needed when the patch is being disabled.
>
> I wrote the statement partly intuitively. I think that it is really
> somehow important. And I am slightly in doubts if we are on the safe side.
>
> First, why is it important that the task->patch_state is not switched
> when being inside the ftrace handler?
>
> If we are inside the handler, we are kind-of inside the called
> function. And the basic idea of this consistency model is that
> we must not switch a task when it is inside a patched function.
> This is normally decided by the stack.
>
> The handler is a bit special because it is called right before the
> function. If it was the only patched function on the stack, it would
> not matter if we choose the new or old code. Both decisions would
> be safe for the moment.
>
> The fun starts when the function calls another patched function.
> The other patched function must be called consistently with
> the first one. If the first function was from the patch,
> the other must be from the patch as well and vice versa.
>
> This is why we must not switch task->patch_state dangerously
> when being inside the ftrace handler.
>
> Now I am not sure if this condition is fulfilled. The ftrace handler
> is called as the very first instruction of the function. Does not
> it break the stack validity? Could we sleep inside the ftrace
> handler? Will the patched function be detected on the stack?
>
> Or is my brain already too far in the fantasy world?

I think this isn't a possibility.

In today's code base, this can't happen because task patch states are
only switched when sleeping or when exiting the kernel. The ftrace
handler doesn't sleep directly.

If it were preempted, it couldn't be switched there either because we
consider preempted stacks to be unreliable.

In theory, a DWARF stack trace of a preempted task *could* be reliable.
But then the DWARF unwinder should be smart enough to see that the
original function called the ftrace handler. Right? So the stack would
be reliable, but then livepatch would see the original function on the
stack and wouldn't switch the task.

Does that make sense?

--
Josh