Re: [BUG] stack tracing causes: kernel/module.c:271 module_assert_mutex_or_preempt

From: Steven Rostedt
Date: Wed Apr 05 2017 - 22:12:40 EST


On Wed, 5 Apr 2017 10:59:25 -0700
"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> > > Could you please let me know if tracing happens in NMI handlers?
> > > If so, a bit of additional code will be needed.
> > >
> > > Thanx, Paul
> > >
> > > PS. Which reminds me, any short-term uses of RCU_TASKS? This represents
> > > 3 of my 16 test scenarios, which is getting hard to justify for
> > > something that isn't used. Especially given that I will need to
> > > add more scenarios for parallel-callbacks SRCU...
> >
> > The RCU_TASK implementation is next on my todo list. Yes, there's going
> > to be plenty of users very soon. Not for 4.12 but definitely for 4.13.
> >
> > Sorry for the delay in implementing that :-/
>
> OK, I will wait a few months before checking again...
>

Actually, I took a quick look at what needs to be done, and I think it
is *really* easy, and may be available in 4.12! Here's the current
patch.

I can probably do a patch to allow optimized kprobes on PREEMPT kernels
as well.

-- Steve

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 8efd9fe..28e3019 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2808,18 +2808,28 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
* callers are done before leaving this function.
* The same goes for freeing the per_cpu data of the per_cpu
* ops.
- *
- * Again, normal synchronize_sched() is not good enough.
- * We need to do a hard force of sched synchronization.
- * This is because we use preempt_disable() to do RCU, but
- * the function tracers can be called where RCU is not watching
- * (like before user_exit()). We can not rely on the RCU
- * infrastructure to do the synchronization, thus we must do it
- * ourselves.
*/
if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
+ /*
+ * We need to do a hard force of sched synchronization.
+ * This is because we use preempt_disable() to do RCU, but
+ * the function tracers can be called where RCU is not watching
+ * (like before user_exit()). We can not rely on the RCU
+ * infrastructure to do the synchronization, thus we must do it
+ * ourselves.
+ */
schedule_on_each_cpu(ftrace_sync);

+#ifdef CONFIG_PREEMPT
+ /*
+ * When the kernel is preemptive, tasks can be preempted
+ * while on an ftrace trampoline. Just scheduling a task on
+ * a CPU is not good enough to flush them. Calling
+ * synchronize_rcu_tasks() will wait for those tasks to
+ * execute and either schedule voluntarily or enter user space.
+ */
+ synchronize_rcu_tasks();
+#endif
arch_ftrace_trampoline_free(ops);

if (ops->flags & FTRACE_OPS_FL_PER_CPU)
@@ -5366,22 +5376,6 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)

static void ftrace_update_trampoline(struct ftrace_ops *ops)
{
-
-/*
- * Currently there's no safe way to free a trampoline when the kernel
- * is configured with PREEMPT. That is because a task could be preempted
- * when it jumped to the trampoline, it may be preempted for a long time
- * depending on the system load, and currently there's no way to know
- * when it will be off the trampoline. If the trampoline is freed
- * too early, when the task runs again, it will be executing on freed
- * memory and crash.
- */
-#ifdef CONFIG_PREEMPT
- /* Currently, only non dynamic ops can have a trampoline */
- if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
- return;
-#endif
-
arch_ftrace_update_trampoline(ops);
}
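
For readers following the thread, the teardown ordering the patch sets up in ftrace_shutdown() can be modelled in user space roughly as below. This is only a sketch under stated assumptions: every model_* and MODEL_* name is hypothetical and not a kernel symbol, the "preempt" argument stands in for CONFIG_PREEMPT, and the steps merely record their order instead of doing any real synchronization.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for FTRACE_OPS_FL_DYNAMIC / FTRACE_OPS_FL_PER_CPU. */
#define MODEL_FL_DYNAMIC (1u << 0)
#define MODEL_FL_PER_CPU (1u << 1)

/* Hypothetical, much reduced stand-in for struct ftrace_ops. */
struct model_ops {
	unsigned int flags;
	int trampoline_allocated;
};

/* Records the order in which the teardown steps ran. */
static char model_log[128];

static void model_step(const char *name)
{
	strncat(model_log, name, sizeof(model_log) - strlen(model_log) - 1);
	strncat(model_log, ";", sizeof(model_log) - strlen(model_log) - 1);
}

/*
 * Models the tail of ftrace_shutdown() after the patch:
 *   1. force a schedule on every CPU (schedule_on_each_cpu(ftrace_sync)),
 *   2. on preemptible kernels, also wait for preempted tasks to leave
 *      the trampoline (synchronize_rcu_tasks()),
 *   3. only then free the trampoline (arch_ftrace_trampoline_free()).
 * "preempt" is an assumption standing in for CONFIG_PREEMPT.
 */
static void model_shutdown(struct model_ops *ops, int preempt)
{
	model_log[0] = '\0';

	if (!(ops->flags & (MODEL_FL_DYNAMIC | MODEL_FL_PER_CPU)))
		return;

	model_step("sched_sync");
	if (preempt)
		model_step("rcu_tasks_sync");

	ops->trampoline_allocated = 0;
	model_step("free_trampoline");
}
```

Calling model_shutdown() on an ops with MODEL_FL_DYNAMIC set and preempt = 1 records all three steps, with the free last; with preempt = 0 the RCU-tasks step is skipped, which matches the #ifdef CONFIG_PREEMPT in the patch.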