Re: [RFC][PATCH 1/5] tracing: Make sure RCU is watching before calling a stack trace

From: Steven Rostedt
Date: Wed May 17 2017 - 12:48:26 EST


On Fri, 12 May 2017 13:31:45 -0700
"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> On Fri, May 12, 2017 at 04:05:32PM -0400, Steven Rostedt wrote:
> > On Fri, 12 May 2017 11:50:03 -0700
> > "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > On Fri, May 12, 2017 at 02:36:19PM -0400, Steven Rostedt wrote:
> > > > On Fri, 12 May 2017 11:25:35 -0700
> > > > "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > > On Fri, May 12, 2017 at 01:15:45PM -0400, Steven Rostedt wrote:
> > > > > > From: "Steven Rostedt (VMware)" <rostedt@xxxxxxxxxxx>
> > > > > >
> > > > > > As stack tracing now requires "rcu watching", force RCU to be watching when
> > > > > > recording a stack trace.
> > > > > >
> > > > > > Signed-off-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
> > > > >
> > > > > Assuming that you never get to __trace_stack() if in an NMI handler,
> > > > > this looks good to me!
> > > > >
> > > > > In contrast, if __trace_stack() ever is called from an NMI handler,
> > > > > invoking rcu_irq_enter() can be fatal.
> > > >
> > > > Then someone may die.
> > > >
> > > > OK, what's the case of running this in nmi? How does perf do it?
> > >
> > > I have no idea. If it cannot happen, then it cannot happen and all
> > > is well, RCU is happy, and I am happy. ;-)
> > >
> > > > Do we just skip the check if it is in an nmi?
> > > >
> > > > 	if (!in_nmi()) {
> > > > 		if (unlikely(rcu_irq_enter_disabled()))
> > > > 			return;
> > > > 		rcu_irq_enter();
> > > > 	}
> > > >
> > > > 	__ftrace_trace_stack();
> > > >
> > > > 	if (!in_nmi())
> > > > 		rcu_irq_exit();
> > > >
> > > > ?
> > >
> > > If it -can- happen, bail out of the function without doing the
> >
> > Why?
> >
> > > __ftrace_trace_stack()? Or does that just cause other problems further
> > > down the road? Or BUG_ON(in_nmi())?
> >
> > Why?
> >
> > > But again if it cannot happen, no problem and no need for extra code.
> >
> > We can't call stack trace from nmi anymore? It calls rcu_read_lock()
> > which is why we need to make sure rcu is watching, otherwise lockdep
> > complains.
>
> Ah, finally got it! If we are in_nmi(), you are relying on the
> NMI handler's call to rcu_nmi_enter(), which works. The piece I was
> forgetting was that you also recently said in an unrelated LKML thread
> that all the functions called at the very beginnings and ends of NMI
> handlers (which can see !in_nmi()) are marked notrace, so that should
> be covered as well.
>
> So never mind! (And thank you for the explanation.)
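
For anyone following along, the guard discussed above can be sketched in
userspace roughly as below. This is only an illustration under stated
assumptions: every sim_* variable and function is a hypothetical stand-in
that counts calls, not the kernel's implementation of in_nmi(),
rcu_irq_enter_disabled(), rcu_irq_enter(), rcu_irq_exit() or
__ftrace_trace_stack(), and trace_stack_sketch() is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simulated state; none of this exists in the kernel. */
static bool sim_nmi_context;          /* pretend we are inside an NMI handler */
static bool sim_enter_disabled;       /* pretend rcu_irq_enter() is not allowed */
static int sim_rcu_irq_depth;         /* enter/exit nesting, must return to 0 */
static int sim_rcu_irq_enters;        /* how many times "rcu_irq_enter" ran */
static int sim_stack_traces;          /* how many stack traces were recorded */

/* Stand-ins for the kernel helpers named in the thread. */
static bool sim_in_nmi(void) { return sim_nmi_context; }
static bool sim_rcu_irq_enter_disabled(void) { return sim_enter_disabled; }

static void sim_rcu_irq_enter(void)
{
	sim_rcu_irq_depth++;
	sim_rcu_irq_enters++;
}

static void sim_rcu_irq_exit(void)
{
	assert(sim_rcu_irq_depth > 0);    /* every exit must pair with an enter */
	sim_rcu_irq_depth--;
}

static void sim_ftrace_trace_stack(void) { sim_stack_traces++; }

/*
 * Same shape as the snippet quoted above: outside NMI, make RCU watch
 * (or bail out if that is not allowed); inside NMI, rely on the NMI
 * entry code having already done the equivalent of rcu_nmi_enter().
 */
static void trace_stack_sketch(void)
{
	bool nmi = sim_in_nmi();

	if (!nmi) {
		if (sim_rcu_irq_enter_disabled())
			return;
		sim_rcu_irq_enter();
	}

	sim_ftrace_trace_stack();

	if (!nmi)
		sim_rcu_irq_exit();
}
```

Sampling the NMI state once into a local is a choice made for this sketch so
that the enter and exit calls stay paired. The quoted snippet calls in_nmi()
twice, which is equivalent as long as the context cannot change in between.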

Is this an Acked-by?

-- Steve