Re: [git pull] latency tracer

From: Ingo Molnar
Date: Sat Feb 09 2008 - 02:02:19 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

> It does include one very interesting new feature that deserves to be
> mentioned outside of the shortlog: 'dynamic ftrace' - which is a
> transparent kernel-image-patcher mechanism that lazily patches out
> mcount callsites from all functions that get executed. [if tracing is
> disabled] These patched out callsites are remembered, and are patched
> back in when tracing is enabled.
>
> This technique does not just accelelerate the "tracing disabled" case
> enormously, we were in fact unable to measure _any_ performance
> difference (within noise) between an mcount-enabled dyn-ftrace [but
> tracing-disabled] and a vanilla kernel (!), on modern CPUs.

hot off the presses: Steve has just finished doing a complete set of
lmbench and hbackbench runs (on 2.8GHz Xeons, dual socket). The
comparison results are at:

http://rostedt.homelinux.com/dyn-ftrace-lmbench/lmbench-table

the results show zero lmbench/hackbench overhead from dyn-ftrace (with
thousands of functions instrumented) apart of the usual lmbench jitter.

The core kernel results for example [these used to be rather sensitive
on mcount overhead]:

Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
ftrace-jm Linux 2.6.24 2783 0.65 0.74 4.74 6.96 5.28 0.76 4.29 290. 866. 2626
ftrace-no Linux 2.6.24 2784 0.65 0.74 4.75 7.14 5.31 0.76 3.92 282. 837. 2591
no-patche Linux 2.6.24 2786 0.64 0.71 4.86 7.10 5.34 0.78 3.89 283. 829. 2572
patched-n Linux 2.6.24 2784 0.64 0.71 4.72 6.98 5.35 0.78 3.94 284. 832. 2582

"ftrace-jm": dyn-ftrace, 2-byte JMPs patched in the mcount callsite
"ftrace-no": dyn-ftrace, 5-byte NOPs patched in (devel patch, not in
this pull request)
"no-patche": no patches [vanilla Linus -git kernel]
"patched-n": patched with the ftrace tree, all tracer options off

i can see an outlier in the "Create" results - that's noise i think
because the patched+options-off kernel is in essence the same as
vanilla. "File reread" is noisy and unreliable as ever. 16-task
ctx-switch results are noisy too (especially the 64k working set ones) -
this is a HT system hence the multi-threaded results are fundamentally
noisy due to sibling interaction.

The hackbench results look good too.

enabling full tracing of all the tens of thousands of kernel functions
makes things measurably slower - but that is expected. (filtered
patching of a user-selected list of function names is an upcoming
feature)

the full raw results are at:

http://rostedt.homelinux.com/dyn-ftrace-lmbench/

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/