Re: [GIT PULL] perf fixes

From: Steven Rostedt
Date: Fri Jun 22 2012 - 19:18:41 EST


On Fri, 2012-06-22 at 22:08 +0200, Hagen Paul Pfeifer wrote:
> Rephrase "do not help": it helps for the framepointer aspect, no
> doubt. So mfentry is superior in that aspect. But it still generates a
> function call.

It generates a function call that at boot up is converted to a 5 byte
nop. The output ends up being:

00000000000008c3 <schedule>:
 8c3:   0f 1f 44 00 00          nop
 8c8:   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
 8cf:   00 00
                        8cd: R_X86_64_32S       current_task
 8d1:   57                      push   %rdi
 8d2:   48 8b 10                mov    (%rax),%rdx
 8d5:   48 85 d2                test   %rdx,%rdx
 8d8:   74 45                   je     91f <schedule+0x5c>
 8da:   48 83 b8 60 06 00 00    cmpq   $0x0,0x660(%rax)
 8e1:   00
 8e2:   75 3b                   jne    91f <schedule+0x5c>
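
To make the call-to-nop conversion concrete, here is a minimal sketch in
C. It works on an ordinary byte buffer, not on live kernel text, and it
is not the kernel's actual patching code. The names encode_call,
call_to_nop and nop5, and the target address used in the note below, are
made up for illustration. The only facts it relies on are that a near
call on x86-64 is the opcode e8 followed by a 32-bit little-endian
displacement (5 bytes total), and that the nop in the listing above is
also 5 bytes, so one can overwrite the other in place.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SITE_LEN 5	/* size of both "call rel32" and the nop */

/* The 5 byte nop from the listing above: 0f 1f 44 00 00 */
const uint8_t nop5[SITE_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Write "call rel32" (opcode e8) for a call located at site. */
void encode_call(uint8_t *buf, uint64_t site, uint64_t target)
{
	/* The displacement is relative to the end of the instruction. */
	int64_t rel = (int64_t)(target - (site + SITE_LEN));
	uint32_t u;

	/* A near call can only reach targets within +/- 2GB. */
	assert(rel >= INT32_MIN && rel <= INT32_MAX);
	u = (uint32_t)(int32_t)rel;

	buf[0] = 0xe8;
	buf[1] = (uint8_t)(u & 0xff);		/* little-endian rel32 */
	buf[2] = (uint8_t)((u >> 8) & 0xff);
	buf[3] = (uint8_t)((u >> 16) & 0xff);
	buf[4] = (uint8_t)((u >> 24) & 0xff);
}

/*
 * Overwrite a call with the 5 byte nop.
 * Returns 0 on success, -1 if buf does not start with a call opcode.
 */
int call_to_nop(uint8_t *buf)
{
	if (buf[0] != 0xe8)
		return -1;
	memcpy(buf, nop5, SITE_LEN);
	return 0;
}
```

As a worked case, a call placed at 0x8c3 to a hypothetical target at
0x1000 encodes as e8 38 07 00 00 (0x1000 - 0x8c8 = 0x738), and
call_to_nop() turns those same 5 bytes into 0f 1f 44 00 00. Checking the
opcode before overwriting is a design choice of this sketch: it refuses
to touch bytes that are not the expected call.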

> Which in turn affects code generation.

How so? It's not a C function call (like the one -finstrument-functions
produces). It's an assembly function call. The only difference between
having ftrace enabled and ftrace disabled with -mfentry is that you get
a 5 byte nop at the start of each traceable function. Sure, it might put
a little pressure on the icache, but in the benchmarks I've run, the
impact has all been within the noise.
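
For contrast, this is roughly what the -finstrument-functions side looks
like. With that flag GCC emits calls to two C-level hooks,
__cyg_profile_func_enter and __cyg_profile_func_exit, passing the
function address and the call site as arguments, at the entry and exit
of every instrumented function. The sketch below is a userspace
illustration only, assuming GCC or Clang; the counters and the add()
function are made up, and nothing here is taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters, for illustration only. */
unsigned long enter_calls;
unsigned long exit_calls;

/*
 * Hooks the compiler calls when a file is built with
 * -finstrument-functions. They must not be instrumented themselves,
 * or they would recurse into themselves.
 */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
	(void)this_fn;
	(void)call_site;
	enter_calls++;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
	(void)this_fn;
	(void)call_site;
	/* In this sketch an exit is always preceded by an enter. */
	assert(exit_calls < enter_calls);
	exit_calls++;
}

/*
 * An ordinary function. Built with -finstrument-functions, its body is
 * bracketed by calls to the two hooks above; built without the flag,
 * the hooks are never called.
 */
int add(int a, int b)
{
	return a + b;
}
```

Each of those hook calls needs two arguments set up in registers and
happens on both entry and exit. The -mfentry call is a single call with
no arguments at the very start of the function, which is why it can be
replaced by a nop of the same size.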

I've been told that it doesn't even hurt the pipeline. But I've Cc'd hpa
and Arjan for their comments. How much impact does a 5 byte nop at the
start of each function really have on the normal operations of the
kernel?

> That was the second concern of Linus, regarding mcount.

Again, the only difference is those 5 bytes. There's no other code
generation difference that I know of.

-- Steve


