Re: A bug in ftrace - dynamic fops

From: Steven Rostedt
Date: Thu Aug 11 2016 - 10:33:40 EST


On Thu, 11 Aug 2016 16:08:58 +0200 (CEST)
Miroslav Benes <mbenes@xxxxxxx> wrote:

> /*
> * Dynamic ops may be freed, we must make sure that all
> * callers are done before leaving this function.
> * The same goes for freeing the per_cpu data of the per_cpu
> * ops.
> *
> * Again, normal synchronize_sched() is not good enough.
> * We need to do a hard force of sched synchronization.
> * This is because we use preempt_disable() to do RCU, but
> * the function tracers can be called where RCU is not watching
> * (like before user_exit()). We can not rely on the RCU
> * infrastructure to do the synchronization, thus we must do it
> * ourselves.
> */
> if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
> schedule_on_each_cpu(ftrace_sync);
>
> arch_ftrace_trampoline_free(ops);
>
> if (ops->flags & FTRACE_OPS_FL_PER_CPU)
> per_cpu_ops_free(ops);
> }
>
> I think the wording could be interpreted in a way that ftrace is
> responsible which is not true according to you.

OK, then I should update it to be a bit more clear, that it is only
worried about its own infrastructure and not what goes on within
ops->func().

I take feedback like yours seriously. If you are confused by it, then
others may be too.

Thus, I think there should be two points documented a bit better.

#1, ops->func() is special, and can be called from any context
including NMI. That means unless you take special care about exactly
what functions are going to be traced (via the filter hashes, which are
not even supported when DYNAMIC_FTRACE is not set), then the
ops->func() must be treated with the same care as an NMI handler would
be. (no locking, no sleeping, etc).

#2, the only synchronization that ftrace will take care of is its own.
That is, the dynamic trampolines have race conditions in the
implementation. The above comment is all about handling its own race
conditions, and doesn't care about what goes on within ops->func().
Although it does give a utility if ops->func() is not recursion safe,
but other than than, like all other function hooks, any synchronization
within the hooks must be taken care of by the caller to
(un)register_ftrace_function(). That includes hooks doing strange
things (like sleeping) and then freeing the ops after the registering.

Although, with #1, #2 may not be needed.

-- Steve