Re: 2.6.33: ftrace triggers soft lockup

From: Steven Rostedt
Date: Fri Mar 05 2010 - 10:06:50 EST

On Fri, 2010-03-05 at 15:16 +0800, Américo Wang wrote:
> On Fri, Mar 5, 2010 at 12:14 PM, Américo Wang <xiyou.wangcong@xxxxxxxxx> wrote:
> > On Thu, Mar 4, 2010 at 9:54 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >> On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
> >>> I am not sure if this is ftrace's fault, but it is ftrace who triggers
> >>> the soft lockup. On my machine, it is pretty easy, just run:
> >>>
> >>> echo function_graph > current_tracer
> >>>

> >
> > I can't say that because I didn't try -rc6.
> >
> Sigh, 2.6.33-rc6 doesn't work, even 2.6.32 doesn't work...

So basically you are saying that the function_graph tracer, when enabled,
has a high overhead? Well, unfortunately, that's expected.

The function_graph tracer traces the start and end of every function. It
uses the same mechanism as the function tracer to trace the start of the
function (mcount), but to trace the exit of a function, on function entry
it hijacks the return address and replaces it with the address of a
trampoline. This trampoline does the trace and then jumps back to the
original return address.

Doing this breaks branch prediction in the CPU, as the CPU uses call/ret
as part of its branch prediction analysis. So function graph tracing is
not just twice as slow as function tracing, it actually has a bigger
impact than that.
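
One rough way to see the difference is to run the same workload under
each tracer and compare the times. Below is a minimal sketch of a helper
for that. The name with_tracer is made up for illustration, and it
assumes debugfs is mounted at /sys/kernel/debug (set TRACING_DIR
otherwise) and that you are root:

```shell
#!/bin/sh
# Assumption: tracing files live here; override TRACING_DIR if your
# debugfs is mounted somewhere else.
TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

# with_tracer TRACER COMMAND...  (hypothetical helper, not part of ftrace)
# Select TRACER, run COMMAND, then switch back to the "nop" tracer.
# Returns COMMAND's exit status.
with_tracer() {
    tracer=$1
    shift
    # Selecting a tracer is just a write to current_tracer.
    echo "$tracer" > "$TRACING_DIR/current_tracer" || return 1
    "$@"
    status=$?
    # "nop" turns tracing back off.
    echo nop > "$TRACING_DIR/current_tracer"
    return "$status"
}
```

Running the same command with "nop", "function" and "function_graph"
as the first argument, and timing each run, should give a feel for the
relative cost on your machine.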

So my question to you is, have you seen the function graph tracer perform
better with the same configs in previous kernels? Also, the function
graph tracer makes other debugging options (like lockdep) have a greater
impact on performance than they usually do.

Now, there are some things you can do to help performance. One is not to
trace functions that are known to have a high hit rate. You can do this
with the set_ftrace_notrace file, or add "ftrace_notrace=func1,func2,func3"
to the kernel command line, where func1,func2,func3 are the functions you
do not want to trace. This just adds them to set_ftrace_notrace by
default, and they can be removed at runtime.

The functions I commonly exclude are:

echo '*spin_lock*' '*spin_unlock*' '*spin_try*' '*rcu_read*' > set_ftrace_notrace

Since these functions are hit quite intensively, not tracing them helps
a bit with performance.
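
If you want to build the list up a piece at a time, something like the
following sketch should work. The helper names notrace_add and
notrace_clear are made up for illustration, and it assumes debugfs is
mounted at /sys/kernel/debug (set TRACING_DIR otherwise) and that you
are root:

```shell
#!/bin/sh
# Assumption: tracing files live here; override TRACING_DIR if your
# debugfs is mounted somewhere else.
TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

# notrace_add PATTERN...  (hypothetical helper)
# Append each glob pattern to set_ftrace_notrace. Using '>>' keeps what
# is already in the list; a plain '>' would replace it.
notrace_add() {
    for pattern in "$@"; do
        # Quote the pattern so the shell does not expand the '*'.
        printf '%s\n' "$pattern" >> "$TRACING_DIR/set_ftrace_notrace"
    done
}

# notrace_clear  (hypothetical helper)
# Writing an empty line with '>' empties the list again.
notrace_clear() {
    echo > "$TRACING_DIR/set_ftrace_notrace"
}
```

For example, notrace_add '*spin_lock*' '*rcu_read*' adds those two
patterns, and "cat set_ftrace_notrace" then shows which functions
matched.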

-- Steve
