Re: [RFC 00/15] x86_64: Optimize percpu accesses

From: Jeremy Fitzhardinge
Date: Wed Jul 09 2008 - 16:34:01 EST


Christoph Lameter wrote:
> Jeremy Fitzhardinge wrote:
>> Ingo Molnar wrote:
>>> Note that the zero-based percpu problems are completely unrelated to
>>> stackprotector. I was able to hit them with a stackprotector-disabled
>>> gcc-4.2.3 environment.
>> The only reason we need to keep a zero-based pda is to support
>> stack-protector. If we drop it, we can drop the pda - and its
>> special zero-based properties - entirely.


> Another reason to use a zero-based per-cpu area is to limit the offset range. Limiting the offset range in turn limits the size of the generated instructions, because the offset is part of the instruction.

No, it makes no difference. %gs:X always has a 32-bit offset in the instruction, regardless of how big X is:

mov %eax, %gs:0
mov %eax, %gs:0x1234567
->
0: 65 89 04 25 00 00 00 00 mov %eax,%gs:0x0
8: 65 89 04 25 67 45 23 01 mov %eax,%gs:0x1234567
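To make the byte layout above concrete, here is a minimal sketch in C that assembles the same absolute-disp32 form by hand. The function name encode_mov_eax_gs_abs is made up for illustration and is not from the kernel or any assembler; the opcode, ModRM and SIB bytes are the ones visible in the disassembly.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): encode "mov %eax, %gs:disp"
 * in the absolute disp32 form shown in the disassembly.
 * Writes into out (at least 8 bytes) and returns the encoded length.
 */
static size_t encode_mov_eax_gs_abs(uint8_t *out, uint32_t disp)
{
    size_t n = 0;

    out[n++] = 0x65; /* GS segment override prefix */
    out[n++] = 0x89; /* MOV r/m32, r32 */
    out[n++] = 0x04; /* ModRM: mod=00, reg=eax, rm=100 (SIB follows) */
    out[n++] = 0x25; /* SIB: no base, no index, disp32 only */

    /* The displacement is always four bytes, little-endian. */
    for (int i = 0; i < 4; i++)
        out[n++] = (uint8_t)(disp >> (8 * i));

    return n;
}
```

Calling it with 0 and with 0x1234567 returns 8 both times; only the last four bytes differ, which is the point being made about the offset size.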


> It also is easier to handle since __per_cpu_start does not figure
> in the calculation of the offsets.

No, you do it the same as i386. You set the segment base to be percpu_area-__per_cpu_start, and then just refer to %gs:per_cpu__foo directly. You can use rip-relative addressing to make it a smaller addressing mode too:

0: 65 89 05 00 00 00 00 mov %eax,%gs:0(%rip) # 0x7
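The segment-base arithmetic can be sketched in plain C as a model, not as kernel code. The helper names gs_base_for and gs_resolve are hypothetical, and any addresses fed to them are made-up values; the only thing taken from the text above is base = percpu_area - __per_cpu_start.

```c
#include <stdint.h>

/*
 * Model of the scheme described above (hypothetical helper names).
 * The segment base is biased by the link-time start of the percpu
 * section, so a variable's ordinary link-time address can be used
 * as the %gs offset without subtracting __per_cpu_start each time.
 */
static uint64_t gs_base_for(uint64_t percpu_area, uint64_t per_cpu_start)
{
    /* Unsigned arithmetic wraps modulo 2^64, so this is fine even
     * when percpu_area is numerically below per_cpu_start. */
    return percpu_area - per_cpu_start;
}

/* What a %gs:symbol access resolves to: segment base plus offset. */
static uint64_t gs_resolve(uint64_t gs_base, uint64_t symbol_addr)
{
    return gs_base + symbol_addr;
}
```

With a variable linked 0x40 bytes past the section start, gs_resolve(gs_base_for(area, start), start + 0x40) yields area + 0x40, i.e. the same slot inside that cpu's copy.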


J