Re: [PATCHv4 5/6] Allow setting O_NONBLOCK flag for new sockets
From: H. Peter Anvin
Date: Mon Nov 26 2007 - 21:39:35 EST
Linus Torvalds wrote:
The 6-word limit is a red herring. There are at least two ways to deal with it
(and this doesn't mean wiping the legacy stuff we already have):
- Let each architecture pick a calling convention and redefine the
architecture-independent bits to take an arbitrary number of arguments. This
is a one-time panarchitectural change.
Not applicable on x86-32.
The six-word limit is effectively a hardware limit there. Once it goes
past that limit, one of the words needs to be a pointer to extended
information that is fundamentally slower to access. Happily, only very
rare system calls do that (and none of them are of the simple variety
where we see a few cycles easily).
On other architectures, we could more easily just use more registers. But
x86-32 is still a big part (bulk) of what matters for most people.
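To make the cost of going past the register limit concrete, here is a minimal user-space sketch in C of the "one word is a pointer to extended information" pattern. Everything in it is an assumption for illustration: `struct extra_args`, `fake_copy_from_user()`, and `demo_syscall()` are hypothetical names, not kernel interfaces, and the plain `memcpy` only stands in for the checked user-space access a real kernel would perform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: arguments that do not fit in the six register words. */
struct extra_args {
    long arg6;
    long arg7;
};

/*
 * Stand-in for a checked user-space copy. A real kernel has to validate
 * the pointer and handle faults; this sketch only rejects NULL.
 * Returns 0 on success, -1 on failure.
 */
int fake_copy_from_user(void *dst, const void *src, size_t n)
{
    if (src == NULL)
        return -1;
    memcpy(dst, src, n);
    return 0;
}

/*
 * Hypothetical handler: five words arrive directly (as if in registers),
 * and the sixth word is a pointer to the remaining arguments, which must
 * be fetched from memory before they can be used.
 */
long demo_syscall(long a1, long a2, long a3, long a4, long a5,
                  const struct extra_args *uptr)
{
    struct extra_args extra;

    if (fake_copy_from_user(&extra, uptr, sizeof extra) != 0)
        return -14; /* same value as -EFAULT on Linux */

    return a1 + a2 + a3 + a4 + a5 + extra.arg6 + extra.arg7;
}

/* Usage: seven logical arguments passed through six words. */
void demo_usage(void)
{
    struct extra_args e = { 6, 7 };

    assert(demo_syscall(1, 2, 3, 4, 5, &e) == 28);
}
```

The extra copy and its error path are the overhead under discussion; the five direct words need neither.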
Well, x86-32 and x86-64 are surprisingly similar here, for very
different reasons (on x86-64 it is because there are only seven clobbered
registers that aren't destroyed by the syscall instruction itself).
However, on both of these we could make the user-space side cheaper, by
making sure that we don't have to do additional copies in user space.
For both these architectures, anything more than 3 parameters (i386) or
6 parameters (x86-64) will be already in memory on the stack, so if we
can use that image as-is then we at least save the intra-user-space copy
that goes along with it.
x86-64 requires some minor thought, since the obvious way of doing it
(using arg register 6 to push in a pointer) would end up with a
discontiguous frame. One can do it with something like this, although
it's not clear to me it is a win at all (the more obvious sequence using
XCHG isn't usable since XCHG locks unconditionally):
	pop	%r10		# Return address
	push	%r9		# Argument 6
	movq	%rsp, %r11
	push	%r10
	movq	%rcx, %r10
	syscall
	cmpq	$-4095, %rax
	jae	...
	pop	%r10
	pop	%r9
	push	%r10
	retq
The number of registers does vary, obviously, with s390 having the smallest
number (5).
Immediately when you do anything but registers, it is much *much* more
costly. The "get_user()" and "copy_from_user()" stuff is not exactly slow,
but it's quite noticeable overhead for simple system calls. It gets worse
if this all is described by some indirect table setup.
True, of course, although we're talking here about different ways to
pull arguments out of userspace memory; *definitely* agreed that we
don't want any additional indirection to be necessary.
-hpa