Re: [patch] uninline init_waitqueue_*() functions

From: Randy.Dunlap
Date: Wed Jul 05 2006 - 17:54:02 EST


On Wed, 5 Jul 2006 23:45:02 +0200 Ingo Molnar wrote:

>
> * Linus Torvalds <torvalds@xxxxxxxx> wrote:
>
> > On Wed, 5 Jul 2006, Ingo Molnar wrote:
> > >
> > > i had CONFIG_DEBUG_INFO (and UNWIND_INFO) disabled in all these build
> > > tests.
> >
> > Good, because I just verified: those two together will on their own
> > increase "text size" by about 17% for me.
> >
> > I still think Andrew is right: I don't see how an initializer that
> > should basically be three instructions can possibly be 35 bytes larger
> > than a function call that should be a minimum of two instructions
> > (argument setup in %eax and the actual call - and that's totally
> > ignoring the deleterious effects of a function call on register
> > liveness).
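For anyone following along, here is a rough userspace sketch of the two forms being compared: an initializer expanded at each call site versus one out-of-line copy that callers reach with a pointer setup plus a call. Everything in it is a stand-in made up for illustration (struct wq_head, wq_init_inline, wq_init_outofline, wq_is_empty, wq_selfcheck, and the lock value of 1); it is not the kernel's wait queue code, and the noinline attribute assumes GCC or Clang.

```c
#include <assert.h>

/* Simplified stand-in for the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical mock of a wait queue head: a lock word plus a list. */
struct wq_head {
    unsigned int lock;            /* stands in for a spinlock */
    struct list_head task_list;   /* empty list points at itself */
};

/* Inlined form: each call site gets its own copy of these stores. */
static inline void wq_init_inline(struct wq_head *q)
{
    q->lock = 1u;                 /* "unlocked" value in this mock */
    q->task_list.next = &q->task_list;
    q->task_list.prev = &q->task_list;
}

/* Out-of-line form: one shared copy of the stores. A caller only has
 * to compute the pointer argument and emit a call instruction. */
static __attribute__((noinline)) void wq_init_outofline(struct wq_head *q)
{
    wq_init_inline(q);
}

/* Returns nonzero when the list is empty (head points at itself). */
static int wq_is_empty(const struct wq_head *q)
{
    return q->task_list.next == &q->task_list &&
           q->task_list.prev == &q->task_list;
}

/* Both forms must leave the structure in the same state. */
static void wq_selfcheck(void)
{
    struct wq_head a, b;

    wq_init_inline(&a);
    wq_init_outofline(&b);
    assert(a.lock == b.lock);
    assert(wq_is_empty(&a) && wq_is_empty(&b));
}
```

Calling wq_selfcheck() from a small main() and comparing the generated code for the two initializers (for example with objdump -d) is one way to see the per-call-site size difference on a given compiler and configuration.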
> >
> > The fact that with allnoconfig the kernel is _smaller_ (but, quite
> > frankly, within the noise) with the inlined version would seem to
> > back up Andrew's position that it really shouldn't matter.
>
> well, the allnoconfig thing is artificial (and uninteresting) for a
> number of reasons:

hm, I'd have to say that allyesconfig is also artificial and the
savings numbers are somewhat uninteresting in that case too.


> - it has REGPARM disabled which penalizes function calls
>
> - it's UP and hence the inlining cost of init_waitqueue_head() is
>   significantly smaller.
>
> - allnoconfig has smaller average function size - increasing the cost of
>   uninlining
>
> > So I'm left wondering why it matters for you, and what triggers it.
> > Maybe there is some secondary issue that could show us an even more
> > interesting optimization (or some compiler behaviour that we should
> > try to encourage).
>
> yeah, i'd not want to skip over some interesting and still unexplained
> effect either, but 35 bytes isn't all that outlandish and from everything
> i've seen it's a real win. Here is an actual example:
>
> c0fb6137:  c7 44 24 08 00 00 00    movl   $0x0,0x8(%esp)
> c0fb613e:  00
> c0fb613f:  c7 44 24 08 01 00 00    movl   $0x1,0x8(%esp)
> c0fb6146:  00
> c0fb6147:  c7 43 60 00 00 00 00    movl   $0x0,0x60(%ebx)
> c0fb614e:  8b 44 24 08             mov    0x8(%esp),%eax
> c0fb6152:  89 43 5c                mov    %eax,0x5c(%ebx)
> c0fb6155:  8d 43 64                lea    0x64(%ebx),%eax
> c0fb6158:  89 40 04                mov    %eax,0x4(%eax)
> c0fb615b:  89 43 64                mov    %eax,0x64(%ebx)
>
> versus:
>
> c0fb070e:  8d 43 5c                lea    0x5c(%ebx),%eax
> c0fb0711:  e8 94 98 18 ff          call   c0139faa <init_waitqueue_head>
>
> so 39 bytes versus 8 bytes - 31 bytes saved. It's a similar win in other
> cases i checked too. (the only exception is for smaller functions that i
> mentioned before: where the parameters are not pre-calculated yet so
> there's no good integration for the function call. In that case it's
> break even, or in some cases a 3-4 bytes loss.)
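The byte counts quoted above can be re-derived from the addresses in the two listings. The small C check below is only a sketch of that arithmetic: the helper names span_bytes, call_target and check_listing are made up here, and the instruction lengths (3 bytes for the final mov, 5 bytes for the e8 call) are read off the byte columns of the listing.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes covered by a run of instructions: from the start of the
 * first instruction to the end of the last one. */
static uint32_t span_bytes(uint32_t first_addr, uint32_t last_addr,
                           uint32_t last_len)
{
    return last_addr + last_len - first_addr;
}

/* Target of an x86 near call (opcode e8): address of the following
 * instruction plus the 32-bit displacement. The displacement is
 * two's complement, so unsigned wraparound gives the right answer. */
static uint32_t call_target(uint32_t call_addr, uint32_t rel32)
{
    return call_addr + 5u + rel32;
}

/* Re-derive the numbers from the two listings in the mail. */
static void check_listing(void)
{
    /* Inlined sequence: c0fb6137 .. c0fb615b, last insn is 3 bytes. */
    uint32_t inlined = span_bytes(0xc0fb6137u, 0xc0fb615bu, 3u);

    /* Call sequence: lea at c0fb070e, 5-byte call at c0fb0711. */
    uint32_t called = span_bytes(0xc0fb070eu, 0xc0fb0711u, 5u);

    assert(inlined == 39u);
    assert(called == 8u);
    assert(inlined - called == 31u);

    /* Displacement bytes 94 98 18 ff are little-endian 0xff189894. */
    assert(call_target(0xc0fb0711u, 0xff189894u) == 0xc0139faau);
}
```

The last assertion just confirms that the encoded displacement does resolve to the c0139faa address objdump printed for init_waitqueue_head.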

---
~Randy