Re: [PATCH v2 4/8] objtool: add undwarf debuginfo generation

From: Ingo Molnar
Date: Tue Jul 11 2017 - 04:41:13 EST

* Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:

> Anyway, I used some linker magic to temporarily move the unwinder code to the
> end of .text, so that unwinder changes don't add unexpected side effects to the
> microbenchmark behavior. Now I'm getting more consistent results: the packed
> struct is measuring ~2% slower. The slight slowdown might just be explained by
> the fact that GCC generates some extra instructions for extracting the fields
> out of the packed struct.

Yeah, 16-bit field accesses are more complex than accessing a zero-extended
32-bit field, even on x86, which has a fair amount of 16-bit legacy.
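
To make the extra work concrete, here is a rough sketch. The struct and field
names are made up for illustration and are not the actual undwarf layout; the
point is only that the packed variant needs sign-extending 16-bit loads and
shift+mask sequences for the bitfields, where the unpacked one does plain
32-bit loads:

```c
#include <stdint.h>

/* Hypothetical unpacked layout: every field is a full 32-bit word. */
struct entry_unpacked {
	int32_t		cfa_offset;
	int32_t		bp_offset;
	uint32_t	cfa_reg;
	uint32_t	bp_reg;
};

/* Hypothetical packed layout: 16-bit offsets plus small bitfields. */
struct entry_packed {
	int16_t		cfa_offset;
	int16_t		bp_offset;
	unsigned	cfa_reg:4;
	unsigned	bp_reg:4;
	unsigned	type:2;
} __attribute__((packed));

/* Plain 32-bit load. */
static inline int unpacked_cfa_offset(const struct entry_unpacked *e)
{
	return e->cfa_offset;
}

/* 16-bit load that has to be sign-extended to int. */
static inline int packed_cfa_offset(const struct entry_packed *e)
{
	return e->cfa_offset;
}

/* Bitfield read: load, then shift and mask to isolate the 4 bits. */
static inline unsigned int packed_bp_reg(const struct entry_packed *e)
{
	return e->bp_reg;
}
```

Comparing the disassembly of the accessors at -O2 shows the difference in
instruction count; the exact sequences depend on the compiler version.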

> In the meantime, I found a ~10% speedup by making the "fast lookup table" block
> size a power-of-two (256) to get rid of the need for a slow 'div' instruction.
> I think I'm done performance tweaking for now. I'll keep the packed struct, and
> add the code for the 'div' removal, and hope to submit v3 soon.

Sounds good to me!
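
For reference, the 'div' removal presumably boils down to something like the
sketch below. The helper and macro names are hypothetical, and I'm assuming
the old block size was a runtime value, so the compiler could not strength
reduce the division by itself. With a fixed power-of-two block size of 256
the block index becomes a single shift:

```c
/* Hypothetical names; only the 256-byte block size is from the thread. */
#define LOOKUP_BLOCK_ORDER	8
#define LOOKUP_BLOCK_SIZE	(1UL << LOOKUP_BLOCK_ORDER)

/* Old style: block size only known at runtime, so this emits a 'div'. */
static inline unsigned long lookup_block_div(unsigned long ip,
					     unsigned long start,
					     unsigned long block_size)
{
	return (ip - start) / block_size;
}

/* New style: constant power-of-two block size, so this is just a shift. */
static inline unsigned long lookup_block_shift(unsigned long ip,
					       unsigned long start)
{
	return (ip - start) >> LOOKUP_BLOCK_ORDER;
}
```

Both helpers return the same index whenever block_size == LOOKUP_BLOCK_SIZE,
so the lookup table contents do not need to change, only how it is indexed.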

~2% slowdown for ~30% RAM savings for a debug data structure that is about as
large as a typical kernel's total .text is a decent trade-off.