Re: A bug in ftrace - dynamic fops

From: Steven Rostedt
Date: Mon Aug 08 2016 - 10:53:12 EST


On Mon, 8 Aug 2016 10:57:45 +0200 (CEST)
Miroslav Benes <mbenes@xxxxxxx> wrote:

> Hi Steven,
>
> I am afraid there is a bug in the current mainline's ftrace when dynamic
> fops are involved.

I'm sorry but I don't see it.

>
> ftrace_shutdown() relies on schedule_on_each_cpu() which should ensure
> that no task stays in a ftrace handler. This was sufficient for a long
> time because every handler was called with the preemption disabled thanks
> to ftrace_ops_list_func (or ftrace_ops_assist_func). Dynamic trampolines
> did not change the behaviour because !PREEMPT was required (commit
> 12cce594fa8f ("ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use
> allocated trampolines")).
>
> Situation changed with the commit 00ccbf2f5b75 ("ftrace/x86: Let dynamic
> trampolines call ops->func even for dynamic fops"). The purpose of the
> patch is clear - to call ops->func whenever possible and thus gain an
> advantage of dynamic trampolines. But it also allows the handler (that
> very ops->func) to sleep because no atomic context is enforced. This
> breaks the assumption for schedule_on_each_cpu() in ftrace_shutdown() and
> one can crash the kernel quite easily.

Not when CONFIG_PREEMPT is not enabled. This is because the kernel does
not preempt unless it specifically asks to be preempted. If ops->func()
calls a mutex, or sleeps, then *THAT* is a bug! ops->func() is more
sensitive than interrupt handlers, and all ops->func()s must use extra
care. This sounds like a "doctor it hurts me when I do this" bug (where
the doctor replies "well, don't do that").

If something registers an ops->func() that can sleep, then it MUST limit
the functions that it can be registered to, and also must handle any
synchronization of the ops itself being freed. This isn't the ftrace
infrastructure's responsibility.


>
> It suffices to register dynamic fops with FTRACE_OPS_FL_RECURSION_SAFE set
> (because otherwise ftrace_ops_assist_func() is used which also disables
> preemption), sleep in the handler and meanwhile remove it.
>
> I can imagine two reasonable solutions...
>
> 1. introduce something similar to ftrace_ops_assist_func() which would
> just disable preemption before calling ops->func and enable it afterwards.

With CONFIG_PREEMPT disabled, how can ops->func() sleep? It can't,
unless it specifically calls a function that can. I don't see the bug
you mention here.


>
> or
>
> 2. implement the whole thing through RCU_TASKS. This would also enable
> dynamic trampolines for PREEMPT kernels.

That said, this has been on my todo list for too long, and will soon be
implemented. I want CONFIG_PREEMPT to allow dynamic trampolines for
dynamic ops too. Not to mention, RCU_TASKS was specifically written for
me to do this. I dropped the ball on this one. :-p

-- Steve



>
> Revert of 00ccbf2f5b75 commit would be solution as well but there is a
> drawback.
>
> Regards,
> Miroslav