On Wed, Feb 06, 2002 at 01:00:12PM -0800, H. Peter Anvin wrote:
> Uhm... none of that "expensive optimization" will help you here,
> because you need *architectural storage*. Archtectural storage is
> always a fundamentally limited resource, regardless of how many shadow
> registers you add.
>
> However, an x86 has a whole additional register file which is rarely
> used these days -- the segment register file. The segment register
> file is mainly useful when the main usage is address offsetting, since
> it provides an extra input into the address adder. For this
> particular purpose, using it is very much the sane thing to do.
>
> As far as %gs switching is concerned, using %gs doesn't automatically
> mean using the LDT. There are two ways you can avoid setting up LDTs
> in single-threaded apps, and still allow an ABI compatible with the
> threaded apps:
>
> a) For single-threaded apps, define %gs == %ds. Less than ideal for
> several reasons, but no kernel mods needed.
This is not possible, since then %gs:0 (which is TLS base) cannot be read.
We would have to change the TLS ABI (thus become incompatible e.g. with Sun)
- note that these sequences are emitted by compiler or linker (the latter in
code transformations) and pick some hardcoded constant, say %gs:0x1000.
This would mean though that every single non-threaded app would need to
mmap a page at 4KB. If this offset was not constant, it would slow down all
TLS accesses.
> b) Have the kernel provide another GDT value which can be used by the
> single-threaded apps.
Like above, a fixed address for mmap would have to be chosen, but the
advantage would be that the TLS ABI would need no changing.
Simply kernel would add 0x33 to GDT as 4KB -> 0xc0000000 user data segment
and apps could put that value into %gs if not using threads.
But I think there is c) and d).
c) is just minor modification of current ldt handling in kernel, which would
mean a single entry LDT (residing in task_struct) could be used instead
of vmalloced one - this has the disadvantage of lldt on almost every
context switch
d) default to a single-entry per-cpu LDT, which only non-linux personality
apps and apps needing more than 1 LDT entry (threaded apps, wine/dosemu/...)
would not use. Non-linux apps would use current default_ldt and those
needing more than one LDT would use the current vmalloced mm private area.
If a task would be using this per-cpu LDT (common case), context switch
would do lldt only if previous task was not using the per-cpu LDT
(unlikely) and just store task_struct->thread.ldt_word_0 and ldt_word_1
into the per-cpu LDT (dunno how expensive is that, but IMHO it should be
cheaper than full load_LDT).
Jakub
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Thu Feb 07 2002 - 21:00:54 EST