Re: [PATCH 3/3] kmemtrace: Use tracepoints instead of markers.

From: Mathieu Desnoyers
Date: Mon Jan 05 2009 - 11:11:00 EST


* Eduard - Gabriel Munteanu (eduard.munteanu@xxxxxxxxxxx) wrote:
> On Fri, Jan 02, 2009 at 06:42:54PM -0500, Mathieu Desnoyers wrote:
> > Because whatever slab_buffer_size() does will be done on the fastpath of
> > the instrumented code *even when instrumentation is disabled*, and this
> > is something we need to avoid above all.
>
> I had doubts about this, so I tried it myself. It seems that when using
> -O2 it generates optimal code, without computing a << b unnecessarily.
> It only precomputes it at -O0. Here's how I tested...
>
> #include <stdio.h>
>
> #define unlikely(x) __builtin_expect(!!(x), 0)
>
> static int do_something_enabled;
>
> static void print_that(unsigned long num)
> {
> printf("input << 5 == %lu\n", num);
> }
>
> static inline void do_something(unsigned long num)
> {
> if (unlikely(do_something_enabled)) /* Like DEFINE_TRACE does. */
> print_that(num);
> }
>
> static void call_do_something(unsigned long in)
> {
> do_something(in << 5);
> }
>
> int main(int argc, char **argv)
> {
> unsigned long in;
>
> if (argc != 3) {
> printf("Wrong number of arguments!\n");
> return 0;
> }
>
> sscanf(argv[1], "%d", &do_something_enabled);
> sscanf(argv[2], "%lu", &in);
>
> call_do_something(in);
>
> return 0;
> }
>
>
> Snippet of objdump output when using -O2:
>
> static inline void do_something(unsigned long num)
> {
> if (unlikely(do_something_enabled))
> 400635: 85 c0 test %eax,%eax
> 400637: 74 be je 4005f7 <main+0x17>
> print_that():
> /home/edi/prj/src/inlineargs/inlineargs.c:9
>
> static int do_something_enabled;
>
> static void print_that(unsigned long num)
> {
> printf("input << 5 == %lu\n", num);
> 400639: 48 c1 e6 05 shl $0x5,%rsi
> 40063d: bf 6e 07 40 00 mov $0x40076e,%edi
> 400642: 31 c0 xor %eax,%eax
> 400644: e8 5f fe ff ff callq 4004a8 <printf@plt>
>
>
> Snippet of objdump output when using -O0:
>
> static void call_do_something(unsigned long in)
> {
> 4005fd: 55 push %rbp
> 4005fe: 48 89 e5 mov %rsp,%rbp
> 400601: 48 83 ec 10 sub $0x10,%rsp
> 400605: 48 89 7d f8 mov %rdi,-0x8(%rbp)
> /home/edi/prj/src/inlineargs/inlineargs.c:20
> do_something(in << 5);
> 400609: 48 8b 45 f8 mov -0x8(%rbp),%rax
> 40060d: 48 89 c7 mov %rax,%rdi
> 400610: 48 c1 e7 05 shl $0x5,%rdi
> 400614: e8 02 00 00 00 callq 40061b <do_something>
>
>
> Look at that shl, it indicates the left-shift (<< 5). In the first case it's
> deferred as much as possible. However, in the second case, it's done
> before calling that inline. Also confirmed with GCC using breakpoints on
> that shl.
>
> Can we take this as general behaviour, i.e. fn(a()), where fn() is inlined
> and a() has no side-effects, will only compute a() when needed, at least on
> GCC and when -O2 is in effect?
>
> It only seems natural to me GCC would do this on a regular basis.
>

Hopefully it does, especially when there are no side-effects. Can you
also try with -Os ?

Mathieu


>
> Cheers,
> Eduard
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/