Re: [PATCH] x86/hweight: Fix and improve __arch_hweight{32,64}() assembly
From: Ingo Molnar
Date: Mon Mar 10 2025 - 16:16:50 EST

* Uros Bizjak <ubizjak@xxxxxxxxx> wrote:
> a) Use ASM_CALL_CONSTRAINT to prevent inline asm that includes a call
> instruction from being scheduled before the frame pointer gets set
> up by the containing function, which causes objtool to print a "call
> without frame pointer save/setup" warning.
>
> b) Use asm_inline to instruct the compiler that the size of asm()
> is the minimum size of one instruction, ignoring how many instructions
> the compiler thinks it is. The ALTERNATIVE macro, which expands to
> several pseudo directives, causes the instruction length estimate to
> count more than 20 instructions.
>
> c) Use named operands in inline asm.
>
> More inlining causes a slight increase in the code size:
>
>     text    data     bss      dec     hex filename
> 27261832 4640296  814660 32716788 1f337f4 vmlinux-new.o
> 27261222 4640320  814660 32716202 1f335aa vmlinux-old.o

What is the per call/inlining-instance change in code size, measured in
fast-path instruction bytes? Also, exception code or cold branches near
the epilogue of the function after the main RET don't fully count as a
size increase.

This kind of normalization and filtering of changes to relevant
generated instructions is a better metric than some rather meaningless
'+610 bytes of code' figure.
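
As a rough sketch of the arithmetic involved (the helper names and the
nr_sites parameter are illustrative assumptions, not anything from the
patch; the number of inlining sites is not given in this thread and
would have to be counted from the disassembly):

```c
/* One row of `size` output: section sizes in bytes. */
struct size_row {
	long text;
	long data;
	long bss;
};

/* Rows copied from the size table quoted in this thread. */
static const struct size_row vmlinux_new = { 27261832, 4640296, 814660 };
static const struct size_row vmlinux_old = { 27261222, 4640320, 814660 };

/* Raw .text growth in bytes between two builds. */
static long text_delta(const struct size_row *new_sz,
		       const struct size_row *old_sz)
{
	return new_sz->text - old_sz->text;
}

/*
 * Average growth per inlining site, in bytes (integer division).
 *
 * nr_sites is a hypothetical input: the count of call sites whose
 * inlining decision changed. Returns 0 when nr_sites is not positive.
 * This is only an average; it does not separate fast-path bytes from
 * cold code placed after the main RET.
 */
static long per_site_delta(long delta, long nr_sites)
{
	return nr_sites > 0 ? delta / nr_sites : 0;
}
```

With the rows above, text_delta() gives the 610 bytes of .text growth,
while .data shrinks by 24 bytes. The per-site number still needs a real
site count, plus a look at the generated code to filter out cold
branches and exception code.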
Also, please always specify the kind of config you used for building
the vmlinux.

Thanks,

	Ingo