Re: RCU stall when using function_graph

From: Steven Rostedt
Date: Wed Aug 02 2017 - 09:07:53 EST


On Wed, 2 Aug 2017 14:42:39 +0200
Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:

> On Tue, Aug 01, 2017 at 08:12:14PM -0400, Steven Rostedt wrote:
> > On Wed, 2 Aug 2017 00:15:44 +0200
> > Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
> >
> > > On 02/08/2017 00:04, Paul E. McKenney wrote:
> > > >> Hi Paul,
> > > >>
> > > >> I have been trying to set the function_graph tracer for ftrace and each time I
> > > >> get a CPU stall.
> > > >>
> > > >> How to reproduce:
> > > >> -----------------
> > > >>
> > > >> echo function_graph > /sys/kernel/debug/tracing/current_tracer
> > > >>
> > > >> This error appears with v4.13-rc3 and v4.12-rc6.
> >
> > Can you bisect this? It may be due to this commit:
> >
> > 0598e4f08 ("ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines")
>
> Hi Steve,
>
> I git bisected but each time the issue occured. I went through the different
> version down to v4.4 where the board was not fully supported and it ended up to
> have the same issue.
>
> Finally, I had the intuition it could be related to the wall time (there is no
> RTC clock with battery on the board and the wall time is Jan 1st, 1970).
>
> Setting up the with ntpdate solved the problem.
>
> Even if it is rarely the case to have the time not set, is it normal to have a
> RCU cpu stall ?
>
>

BTW, function_graph tracer is the most invasive of the tracers. It's 4x
slower than function tracer. I'm wondering if the tracer isn't the
cause, but just slows things down enough to cause a some other race
condition that triggers the bug.

-- Steve