Re: [PATCH x86/mm 6/6] x86-64 ia32 ptrace get/putreg32 current task

From: H. Peter Anvin
Date: Thu Nov 29 2007 - 14:18:02 EST


Andi, do you happen to remember the details on this?

-hpa


Linus Torvalds wrote:

On Thu, 29 Nov 2007, H. Peter Anvin wrote:
> Linus Torvalds wrote:
> > > It is advantageous for user space to use the register the kernel typically
> > > won't, in order to speed up system call entry/exit.
> > but I'm not seeing the reason for that one. Care to comment more? (Yes,
> > there is often a latency from segment reload to use, but the reload latency
> > for system call exit *should* be entirely covered by the cost of doing the
> > system call return itself, no?)
> I do seem to recall that some processor implementations can load a NULL
> segment faster than a non-NULL segment. This was significant enough that we
> wanted to use %fs in x86-64 userspace, as opposed to the original ABI which
> used %gs both in userspace and in the kernel.

Ahh, I think you may be right for some CPUs. The zero selector is indeed potentially faster to load, since the CPU doesn't even have to look at the GDT/LDT.
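
To make the GDT/LDT point concrete, here is a minimal C sketch of how a 16-bit x86 selector value breaks down into its fields, and why a NULL selector can be recognised from the value alone, without any descriptor-table access. The struct and function names (selector_fields, selector_decode, selector_make, selector_is_null) are hypothetical and exist only for this illustration; they are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * x86 segment selector layout (16 bits):
 *   bits 0-1   RPL    requested privilege level
 *   bit  2     TI     table indicator: 0 = GDT, 1 = LDT
 *   bits 3-15  index  entry number in the selected descriptor table
 */
struct selector_fields {
    unsigned rpl;       /* 0..3 */
    bool     uses_ldt;  /* true if TI = 1 */
    unsigned index;     /* 0..8191 */
};

/* Split a raw selector value into its three fields. */
static inline struct selector_fields selector_decode(uint16_t sel)
{
    struct selector_fields f;

    f.rpl      = sel & 0x3u;
    f.uses_ldt = ((sel >> 2) & 0x1u) != 0;
    f.index    = sel >> 3;
    return f;
}

/* Build a raw selector value from its fields. */
static inline uint16_t selector_make(unsigned index, bool uses_ldt, unsigned rpl)
{
    assert(index < 8192 && rpl < 4);
    return (uint16_t)((index << 3) | (uses_ldt ? 0x4u : 0x0u) | rpl);
}

/*
 * A NULL selector is index 0 in the GDT (TI = 0) with any RPL, i.e. the
 * raw values 0..3. The check needs only the selector bits themselves,
 * so no descriptor has to be fetched to classify it.
 */
static inline bool selector_is_null(uint16_t sel)
{
    return (sel & 0xFFFCu) == 0;
}
```

For example, selector_decode(0x2b) yields index 5, TI = 0, RPL 3, which is a non-NULL selector whose descriptor has to be read from the GDT, while selector_is_null(0) is true. Whether skipping that lookup saves measurable cycles depends on the CPU, as discussed in this thread.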

That said, I doubt it's very noticeable. I just ran tests on both an old P4 and a more modern Core 2 machine, and on both of them the performance was identical between loading a NULL selector and loading a non-zero one.

I could well imagine that it matters by a few cycles on other CPUs. But in my testing it definitely isn't noticeable, and I think the maintenance advantage of using the same segment setup would more than make up for the possibility that some odd CPU sees a difference.

Linus