Re: [BUG] stack tracing causes: kernel/module.c:271 module_assert_mutex_or_preempt
From: Steven Rostedt
Date: Wed Apr 05 2017 - 15:21:57 EST
On Wed, 5 Apr 2017 12:08:10 -0700
"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Wed, Apr 05, 2017 at 02:54:25PM -0400, Steven Rostedt wrote:
> > On Wed, 5 Apr 2017 10:59:25 -0700
> > "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > > Note, this has nothing to do with trace_rcu_dyntick(). It's the
> > > > function tracer tracing inside RCU, calling the stack tracer to record
> > > > a new stack if it sees it's larger than any stack before. All I need is
> > > > a way to tell the stack tracer to not record a stack if it is in this
> > > > RCU critical section.
> > > >
> > > > If you can add an "in_rcu_critical_section()" function, that the stack
> > > > tracer can test, and simply exit out like it does if in_nmi() is set,
> > > > that would work too. Below is my current work around.
> > >
> > > Except that the rcu_irq_enter() would already have triggered the bug
> > > that was (allegedly) fixed by my earlier patch. So, yes, the check for
> > > rcu_is_watching() would work around this bug, but the hope is that
> > > with my earlier fix, this workaround would not be needed.
> >
> > Note, if I had an "in_rcu_critical_section()" I wouldn't need to call
> > rcu_irq_enter(). I could fall out before that. My current workaround
> > does the check even though it breaks things; it would hopefully fix
> > things, as it calls rcu_irq_exit() immediately.
>
> OK, color me confused. What would "in_rcu_critical_section()" do?
>
> The rcu_is_watching() function tells you that RCU is not in an extended
> quiescent state, though its return value can be iffy in the middle of
> rcu_eqs_enter_common() -- which is why interrupts are disabled there.
> In preemptible RCU, you can (but shouldn't) use rcu_preempt_depth()
> to determine whether you are within an RCU read-side critical section,
> which is what in_rcu_critical_section() sounds like to me, but I don't
> see how this information would help in this situation.
>
> What am I missing here?
>
Would in_guts_of_internal_rcu_infrastructure_code() work? :-)

Here's the crucial part of that stack dump again:
save_stack_trace+0x1b/0x1d
check_stack+0xec/0x24a
stack_trace_call+0x40/0x53
0xffffffffa0026077
? ftrace_graph_caller+0x78/0xa8
? trace_hardirqs_off+0xd/0xf
? rcu_eqs_enter_common.constprop.71+0x5/0x108
rcu_eqs_enter_common.constprop.71+0x5/0x108
rcu_idle_enter+0x51/0x72
The stack tracer was invoked on rcu_eqs_enter_common(), inside the
rcu_idle_enter() function call.

Perhaps I could just let RCU disable stack tracing? Something like this:
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 50fee76..f894fc3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -853,8 +853,10 @@ void rcu_idle_enter(void)
 	unsigned long flags;

 	local_irq_save(flags);
+	disable_stack_tracer();
 	rcu_eqs_enter(false);
 	rcu_sysidle_enter(0);
+	enable_stack_tracer();
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
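
For illustration only, here is a minimal user-space sketch of how such a disable/enable pair could gate the stack tracer. The names disable_stack_tracer() and enable_stack_tracer() come from the diff above. Everything else is an assumption and not the kernel implementation: the counter stack_tracer_disabled, the variable max_stack_size, and this simplified check_stack() signature. A thread-local counter stands in for what would presumably be per-CPU state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: thread-local stand-in for a per-CPU "tracer disabled" count. */
static _Thread_local int stack_tracer_disabled;

/* Assumption: models the largest stack size the tracer has recorded so far. */
static size_t max_stack_size;

/* Nestable: each disable must be paired with exactly one enable. */
static void disable_stack_tracer(void)
{
	stack_tracer_disabled++;
}

static void enable_stack_tracer(void)
{
	assert(stack_tracer_disabled > 0);	/* catch unbalanced enables */
	stack_tracer_disabled--;
}

/*
 * Simplified model of the tracer callback: bail out early while disabled,
 * much like the existing in_nmi() early exit, otherwise record a new
 * maximum. Returns true only when a new maximum was recorded.
 */
static bool check_stack(size_t this_size)
{
	if (stack_tracer_disabled)
		return false;
	if (this_size <= max_stack_size)
		return false;
	max_stack_size = this_size;
	return true;
}
```

With something along these lines, a call to check_stack() made between disable_stack_tracer() and enable_stack_tracer() returns without recording anything, so the RCU code in between is never walked by the stack tracer.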
-- Steve