Re: [GIT PULL] x86/cpu changes for v2.6.34

From: Steven Rostedt
Date: Mon Mar 01 2010 - 14:42:14 EST


On Mon, 2010-03-01 at 08:47 -0800, Linus Torvalds wrote:

> Both of you seemed to miss the fact that it's not cpu7 that is
> particularly slow. See the original email from me in this thread: the jump
> was at some random point:
>
> [ 0.245179] CPU 1 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
> [ 0.265332] #2
> [ 0.353185] CPU 2 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
> [ 0.373328] #3
> [ 2.193277] CPU 3 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
> [ 2.213379] #4
>
> and the reason I grepped for "CPU 7" was that it's the _last_ CPU on this
> machine, so what I was grepping for was basically "how long did it take to
> bring up all CPU's".
>
> So that particular really bad case apparently happened for CPU#3, but the
> two other slow cases happened for CPU#4.
>
> Also, it seems to happen only about every fifth boot or so. Suggestions
> for something simple that can trace things like that?

As Frederic has said, you can use 'ftrace=function_graph' on the kernel
command line. It will be initialized in early_initcall (which I believe
is before the CPUs are set up). Then add a tracing_off() after the
troublesome code. You can make the trace buffers bigger with the kernel
command line:

trace_buf_size=10000000

The above will make the trace buffer 10Meg per CPU. Unlike the
"buffer_size_kb" file, this number is in bytes, although it will be
rounded to the nearest page. (I probably should change this to kb,
rename it to trace_buf_size_kb, and deprecate trace_buf_size.)
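
To get a feel for what a byte value like the one above means in the
units of "buffer_size_kb", here is a rough sketch of the arithmetic. It
is not the kernel's code: buf_size_summary() is a made-up helper, the
4096-byte page size is an assumption, and it assumes the size is rounded
up to whole pages while ignoring any per-page bookkeeping overhead.

```python
import math

# Assumption: 4 KiB pages; the real page size depends on the architecture.
ASSUMED_PAGE_SIZE = 4096


def buf_size_summary(size_bytes, page_size=ASSUMED_PAGE_SIZE):
    """Return (size in KiB, whole pages needed) for a byte count.

    Hypothetical helper for illustration only. It rounds up to whole
    pages and ignores any per-page header the ring buffer may reserve.
    """
    kib = size_bytes / 1024
    pages = math.ceil(size_bytes / page_size)
    return kib, pages


kib, pages = buf_size_summary(10000000)
print(f"{kib} KiB, {pages} pages per CPU")
```

Under those assumptions, trace_buf_size=10000000 comes to a little under
9766 KiB per CPU.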

Then you can cat out /debug/tracing/trace, and search for large
latencies in the timestamps.
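
Searching for the jumps by eye gets tedious, so here is a minimal sketch
of a script that does it. It assumes each line starts with a printk-style
"[ seconds.microseconds]" timestamp, as in the boot log quoted above; the
trace file lays its columns out differently, so TS_RE would need to be
adapted for it. find_gaps() and the 1 second default threshold are made
up for illustration.

```python
import re

# Assumption: a printk-style timestamp such as "[ 2.193277]" starts the line.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def find_gaps(lines, threshold=1.0):
    """Return (gap_seconds, previous_line, current_line) for each jump
    of at least `threshold` seconds between consecutive timestamped lines.
    Lines without a recognizable timestamp are skipped."""
    gaps = []
    prev_ts = None
    prev_line = None
    for line in lines:
        line = line.rstrip("\n")
        match = TS_RE.match(line)
        if not match:
            continue
        ts = float(match.group(1))
        if prev_ts is not None and ts - prev_ts >= threshold:
            gaps.append((ts - prev_ts, prev_line, line))
        prev_ts, prev_line = ts, line
    return gaps


def report(lines, threshold=1.0):
    """Print every gap found, with the lines on either side of it."""
    for gap, before, after in find_gaps(lines, threshold):
        print(f"{gap:.6f}s gap between:\n  {before}\n  {after}")
```

Something like report(open("boot.log")) would run it over a saved log
(the file name is only an example). Fed the six log lines quoted above,
it flags the single jump of about 1.82 seconds between "#3" and
"CPU 3 MCA banks".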

-- Steve


