Re: x86_64 Compiler Output Kernel Bloat v4.4
From: Jeff Merkey
Date: Mon Jan 18 2016 - 16:45:50 EST
On 1/18/16, Jeff Merkey <linux.mdb@xxxxxxxxx> wrote:
> Hi,
>
> I noticed that in the assembler output for the x86_64 builds almost
> every function originating from C code begins with a nop
> instruction. My concern is the wasted space, since each of these
> placeholders takes up 5 bytes at the head of the function. Does
> anyone know why this instruction is there in the first place?
> Because nearly every function is prefaced by this inert 5-byte
> instruction, it adds up to quite a bit of bloat in the size of the
> Linux executable.
>
> 0xffffffffa073e010 0F1F440000 nop DWORD PTR [rax+rax]=0x0
>
> The Intel assembler format shows the bytes that comprise each
> instruction. The GDB format does not. Both are provided below.
>
> (0)> id mdb_watchdogs
> mdb|mdb_watchdogs:
> 0xffffffffa073e010 mdb_watchdogs: nopl 0x0(%rax,%rax,1)) <<
> 0xffffffffa073e015 mdb_watchdogs+0x5: push %rbp
> 0xffffffffa073e016 mdb_watchdogs+0x6: mov %rsp,%rbp
> 0xffffffffa073e019 mdb_watchdogs+0x9: callq 0xffffffff811337e0 touch_softlockup_watchdog_sync
> 0xffffffffa073e01e mdb_watchdogs+0xe: callq 0xffffffff810f0ba0 clocksource_touch_watchdog
> 0xffffffffa073e023 mdb_watchdogs+0x13: callq 0xffffffff810dea20 rcu_cpu_stall_reset
> 0xffffffffa073e028 mdb_watchdogs+0x18: callq 0xffffffff811337c0 touch_nmi_watchdog
> 0xffffffffa073e02d mdb_watchdogs+0x1d: pop %rbp
> 0xffffffffa073e02e mdb_watchdogs+0x1e: data16
> 0xffffffffa073e030 mdb_watchdogs+0x20: retq
> 0xffffffffa073e031 mdb_watchdogs+0x21: nopw %cs:0x0(%rax,%rax,1))
> mdb|mdb:
> 0xffffffffa073e040 mdb: nopl 0x0(%rax,%rax,1)) <<
> 0xffffffffa073e045 mdb+0x5: push %rbp
> 0xffffffffa073e046 mdb+0x6: mov %rsp,%rbp
> 0xffffffffa073e049 mdb+0x9: push %r15
> 0xffffffffa073e04b mdb+0xb: push %r14
> 0xffffffffa073e04d mdb+0xd: mov %rdi,%r14
> 0xffffffffa073e050 mdb+0x10: push %r13
> (0)> u mdb_watchdogs
> mdb|mdb_watchdogs:
> 0xffffffffa073e010 0F1F440000 nop DWORD PTR [rax+rax]=0x0 <<
> 0xffffffffa073e015 55 push rbp
> 0xffffffffa073e016 4889E5 mov rbp,rsp
> 0xffffffffa073e019 E8C2579FE0 call touch_softlockup_watchdog_sync
> 0xffffffffa073e01e E87D2B9BE0 call clocksource_touch_watchdog
> 0xffffffffa073e023 E8F8099AE0 call rcu_cpu_stall_reset
> 0xffffffffa073e028 E893579FE0 call touch_nmi_watchdog
> 0xffffffffa073e02d 5D pop rbp
> 0xffffffffa073e02e 6690 data16
> 0xffffffffa073e030 C3 ret
> 0xffffffffa073e031 6666666666662E0F1F840000000000 nop cs:WORD PTR [rax+rax]=0x0000
> mdb|mdb:
> 0xffffffffa073e040 0F1F440000 nop DWORD PTR [rax+rax]=0x0 <<
> 0xffffffffa073e045 55 push rbp
> 0xffffffffa073e046 4889E5 mov rbp,rsp
> 0xffffffffa073e049 4157 push r15
> 0xffffffffa073e04b 4156 push r14
> 0xffffffffa073e04d 4989FE mov r14,rdi
> 0xffffffffa073e050 4155 push r13
> (0)> g
>
> Jeff
>
I think xor eax,eax is a lot shorter, at 2 bytes instead of 5.
Jeff
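
For a rough sense of the sizes being compared, here is a minimal sketch
in Python. The 5-byte nop encoding is copied from the listing above; the
xor eax,eax encoding (31 C0) is the standard one and does not appear in
the listing. The function count is a hypothetical figure chosen for
illustration, not a measurement of any kernel build. The comparison is
about size only: xor eax,eax writes eax and the flags, so it is not an
inert instruction the way the nop is.

```python
# Byte encodings of the two instructions being compared.
ENTRY_NOP = bytes.fromhex("0F1F440000")  # nop DWORD PTR [rax+rax]=0x0 (from the listing)
XOR_EAX_EAX = bytes.fromhex("31C0")      # xor eax,eax (standard encoding, not in the listing)


def entry_overhead(num_functions: int, insn: bytes = ENTRY_NOP) -> int:
    """Total bytes used by one copy of `insn` at the head of each function."""
    return num_functions * len(insn)


if __name__ == "__main__":
    # HYPOTHETICAL function count, for illustration only.
    num_functions = 40_000

    print("entry nop size:   ", len(ENTRY_NOP), "bytes")
    print("xor eax,eax size: ", len(XOR_EAX_EAX), "bytes")
    print("total with nop:   ", entry_overhead(num_functions), "bytes")
    print("total with xor:   ", entry_overhead(num_functions, XOR_EAX_EAX), "bytes")
```

Substituting a real function count, for example one taken from the
symbol table of the build in question, would turn this into an actual
estimate.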