* Jeremy Fitzhardinge <jeremy@xxxxxxxx> wrote:
Brian Gerst wrote:
Merge the 32-bit and 64-bit versions of these functions. Unlike 32-bit,Yeah, it's a real pity we can't convince the linker to do this simple computation as a single %gs:ADDR addressing mode. On the other hand, if the compiler can reuse the computation of %reg 2-3 times, then the generated code could well end up being denser.
the segment base is the current cpu's PDA instead of the offset from the
original per-cpu area. This is because GCC hardcodes the stackprotector
canary at %gs:40. Since the assembler is incapable of relocating against
multiple symbols, the code ends up looking like:
movq $per_cpu__var, reg
subq $per_cpu__pda, reg
movq %gs:(reg), reg
This is still atomic since the offset is a constant (just calculated at
runtime) and not dependant on the cpu number.
There's a nice project for linker hackers?
I'd like to see some kernel image size measurements done on x86 defconfig to see how much real impact this has on code density. Unless the impact is horribly unacceptable, removing ~200 lines of weird x86-specific APIs is a definitive plus.