Re: [RFC 00/15] x86_64: Optimize percpu accesses
From: Eric W. Biederman
Date: Wed Jul 09 2008 - 20:02:17 EST
"H. Peter Anvin" <hpa@xxxxxxxxx> writes:
> Eric W. Biederman wrote:
>>>>
>>> CONFIG_PHYSICAL_START rather. And no, it can't be zero! Realistically we
>>> should make it 16 MB by default (currently 2 MB), to keep the DMA zone clear.
>>
>> Also on x86_64 CONFIG_PHYSICAL_START is irrelevant as the kernel text segment
>> is linked at a fixed address -2G and the option only determines the virtual
>> to physical address mapping.
>>
>
> No, it's not irrelevant; we currently base the kernel at virtual address -2 GB
> (KERNEL_IMAGE_START) + CONFIG_PHYSICAL_START, in order to have the proper
> alignment for large pages.
Ugh. That is silly. We need to restrict CONFIG_PHYSICAL_START to the aligned
choices, obviously. But -2G is better aligned than anything else we can do virtually.
For the 32-bit code we need to play some of those games because it doesn't have
its own magic chunk of the address space to live in.
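To make the alignment point concrete, here is a minimal sketch in C. The 2 MB
large-page size and the numeric value used for KERNEL_IMAGE_START are assumptions
for illustration, and kernel_virt_base() and large_page_aligned() are hypothetical
helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define KERNEL_IMAGE_START 0xffffffff80000000ULL /* -2 GB as a 64-bit address */
#define LARGE_PAGE_SIZE    (2ULL << 20)          /* 2 MB large page */

/* Virtual base as described in the quoted text: -2 GB plus the physical start. */
static uint64_t kernel_virt_base(uint64_t physical_start)
{
    return KERNEL_IMAGE_START + physical_start;
}

/* Nonzero when addr sits on a large-page boundary. */
static int large_page_aligned(uint64_t addr)
{
    return (addr & (LARGE_PAGE_SIZE - 1)) == 0;
}
```

With these assumed values, -2G itself and -2G + 16 MB are both large-page aligned,
while -2G + 1 MB is not, which is why only the aligned CONFIG_PHYSICAL_START
choices make sense.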
>> That said the idea may not be too far off.
>>
>> Potentially we could put the percpu area at our fixed -2G address and then
>> we have a constant (instead of an address) we could subtract from this
>> address.
>
> We can't put it at -2 GB since the offset +40 for the stack sentinel is
> hard-coded into gcc. This leaves growing upward from +48 (or another small
> positive number), or growing down from zero (or +40) as realistic options.
I was thinking everything except that access would be done as:
%gs:var - -2G aka
%gs:var - START_KERNEL.
So that everything was a small 32-bit number that the linker and the compiler can
resolve. The trick is to put the stack canary at 40 decimal.
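Roughly, the arithmetic I have in mind looks like the sketch below. It assumes the
percpu area is linked at START_KERNEL and that START_KERNEL is -2G; the struct
percpu_head_sketch, percpu_disp() and fits_disp32() are hypothetical names for
illustration, not existing kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define START_KERNEL 0xffffffff80000000ULL /* assumed: -2 GB */

/* Hypothetical layout of the start of the percpu area: canary at 40 decimal. */
struct percpu_head_sketch {
    unsigned char pad[40];
    unsigned long stack_canary;
};

/* Build-time check that the canary really lands where gcc expects it. */
_Static_assert(offsetof(struct percpu_head_sketch, stack_canary) == 40,
               "stack canary must sit at offset 40");

/* Displacement from the %gs base: link-time address minus START_KERNEL. */
static uint64_t percpu_disp(uint64_t linked_addr)
{
    return linked_addr - START_KERNEL;
}

/* Nonzero when d is representable as a sign-extended 32-bit displacement. */
static int fits_disp32(uint64_t d)
{
    return d <= 0x7fffffffULL || d >= 0xffffffff80000000ULL;
}
```

A variable linked at START_KERNEL + 40 comes out as displacement 40, and anything
within the percpu area stays a small number that fits a 32-bit displacement.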
I was just trying to find a compile-time known location for the start of the percpu
area so we could subtract it off.
Unless the linker just winds up overflowing in the subtraction and doing hideous
things to us. Although that should be pretty easy to spot and to test for at
build time.
-2G has the interesting distinction that we might get away with just dropping the
high bits.
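One reading of "dropping the high bits", sketched in C: every address in the top
2 GB is the sign extension of its own low 32 bits, so truncating to 32 bits loses
nothing there. The helpers trunc_sext32() and survives_truncation() are
hypothetical and only illustrate the arithmetic; they say nothing about what the
linker actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of addr, then sign-extend back to 64 bits. */
static uint64_t trunc_sext32(uint64_t addr)
{
    uint64_t low = addr & 0xffffffffULL;

    if (low & 0x80000000ULL)
        return low | 0xffffffff00000000ULL;
    return low;
}

/* Nonzero when the address is unchanged by the truncate-and-extend round trip. */
static int survives_truncation(uint64_t addr)
{
    return trunc_sext32(addr) == addr;
}
```

Under that reading, -2G and -2G + 40 both survive the round trip, while an address
outside the top 2 GB does not, which is the sort of overflow a build-time test
could catch.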
> Unfortunately, GNU ld handles grow-down not at all.
Another alternative that almost fares better than a segment with
a base of zero is a base of -32K or so. The only trouble is that it would get us
manually managing the per cpu area size again.
Eric