Re: There is a Tasks RCU stall warning

From: Paul E. McKenney
Date: Wed Apr 12 2017 - 10:19:53 EST


On Wed, Apr 12, 2017 at 09:18:21AM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2017 20:23:07 -0700
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>
> > But another question...
> >
> > Suppose someone traced or probed or whatever a call to (say)
> > cond_resched_rcu_qs(). Wouldn't that put the call to this
> > function in the trampoline itself? Of course, if this happened,
> > life would be hard when the trampoline was freed due to
> > cond_resched_rcu_qs() being a quiescent state.
>
> Not at all, because the trampoline happens at the beginning of the
> function. Not in the guts of it (unless something in the guts was
> traced). But even then, it should be fine as the change was already
> made.
>
> /* unhook trampoline from function calls */
> unregister_ftrace_function(my_ops);
>
> synchronize_rcu_tasks();
>
> kfree(my_ops->trampoline);
>
>
> Thus, once the unregister_ftrace_function() is called, no new entries
> into the trampoline can happen. The synchronize_rcu_tasks() is to move
> those that are currently on a trampoline off.

OK, good! (I thought that these things could appear anywhere.)
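
For what it is worth, here is a rough userspace model of the ordering described above, just to make the sequence concrete. Every name in it (toy_ops, toy_task, toy_teardown() and friends) is made up for illustration and is not a kernel API. It also bakes in the assumption that a task never does a voluntary context switch while on the trampoline, which is what lets the grace period move tasks off of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define NR_TASKS 4

struct toy_task {
	bool on_trampoline;	/* task is currently executing trampoline code */
};

struct toy_ops {
	bool hooked;		/* call sites still jump to the trampoline */
	void *trampoline;	/* stands in for the trampoline's memory */
};

static struct toy_task tasks[NR_TASKS];

/* Entering a traced function reaches the trampoline only while hooked. */
static bool toy_enter_traced_function(struct toy_ops *ops, struct toy_task *t)
{
	if (!ops->hooked || !ops->trampoline)
		return false;
	t->on_trampoline = true;
	return true;
}

/*
 * Assumption of the model: a voluntary context switch can only happen
 * once the task has left the trampoline.
 */
static void toy_voluntary_switch(struct toy_task *t)
{
	t->on_trampoline = false;
}

/* Stand-in for unhooking: no new entries into the trampoline after this. */
static void toy_unregister(struct toy_ops *ops)
{
	ops->hooked = false;
}

/*
 * Stand-in for the grace period: a real one waits for each task to do a
 * voluntary context switch; the model simply drives each task to one.
 */
static void toy_synchronize_tasks(void)
{
	for (size_t i = 0; i < NR_TASKS; i++)
		toy_voluntary_switch(&tasks[i]);
}

static size_t toy_tasks_on_trampoline(void)
{
	size_t n = 0;

	for (size_t i = 0; i < NR_TASKS; i++)
		if (tasks[i].on_trampoline)
			n++;
	return n;
}

/* Unhook, wait, and only then free, in that order. */
static bool toy_teardown(struct toy_ops *ops)
{
	toy_unregister(ops);
	toy_synchronize_tasks();
	if (toy_tasks_on_trampoline() != 0)
		return false;	/* would be a use-after-free hazard */
	free(ops->trampoline);
	ops->trampoline = NULL;
	return true;
}
```

In this model, swapping the first two steps of toy_teardown() would leave a window in which a task could enter the trampoline after the wait had already completed, which is why the unhook has to come first.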

If it ever becomes necessary, I suppose you could have a function
call as the very last thing on a trampoline. Do the (off-trampoline)
return-address push, jump to the function, and after that the
trampoline is no longer needed.

Assuming that the called function doesn't try accessing the code
surrounding the call, but that would be a problem in any case.

> Is there a way that a task could be in the middle of
> cond_resched_rcu_qs() and get preempted by something while on the
> ftrace trampoline, then the above "unregister_ftrace_function()" and
> "synchronize_rcu_tasks()" can be called and finish, while the one task
> is still on the trampoline and never finished the cond_resched_rcu_qs()?

Well, if the kernel being ftraced is a guest OS and the hypervisor
preempts it at just that point...

> > Or is there something that takes care to avoid putting calls to
> > this sort of function (and calls to any function calling this sort
> > of function, directly or indirectly) into a trampoline?
>
> The question is, if it's on the trampoline in one of these functions
> when synchronize_rcu_tasks() is called, will it still be on the
> trampoline when that returns?

If the function's return address is within the trampoline, it seems to
me that bad things could happen.

Thanx, Paul