Re: sched_setaffinity usability

From: Ingo Molnar
Date: Thu Mar 18 2004 - 16:49:35 EST



* Davide Libenzi <davidel@xxxxxxxxxxxxxxx> wrote:

> > Right now the VDSO mostly contains code and exception-handling data, but
> > it could contain real, userspace-visible data just as much: info that is
> > only known during the kernel build. There's basically no cost in adding
> > more fields to the VDSO, and it seems to be superior to any of the other
> > approaches. Is there any reason not to do it?
>
> With /proc/something you can have a single piece of code for all archs
> that exports NR_CPUS. The VDSO should be added to all missing archs.
> IMO performance is not an issue in getting NR_CPUS from userspace.

you just cannot beat the mapping performance of a near-zero-overhead
(V)DSO. No copying. No syscalls to set it up. No runtime dependencies on
having some filesystem mounted in the right spot. Already existing
framework to handle various API issues. Debuggers know the layout.
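
To illustrate the "no syscalls to set it up" point: the kernel maps the vDSO before the process starts and passes its address in the ELF auxiliary vector. The sketch below assumes a Linux system whose glibc provides getauxval() (an interface that postdates this thread); the helper names vdso_base() and looks_like_elf() are hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>
#include <sys/auxv.h>

/* Returns the address of the kernel-provided vDSO image, or NULL if the
 * kernel did not map one into this process. getauxval() only reads the
 * auxiliary vector already placed in our address space at exec time. */
static const void *vdso_base(void)
{
    return (const void *)getauxval(AT_SYSINFO_EHDR);
}

/* Returns 1 if the mapping at base begins with the ELF magic bytes,
 * i.e. it has the ordinary DSO layout that debuggers understand. */
static int looks_like_elf(const void *base)
{
    return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

Calling looks_like_elf(vdso_base()) touches no file and needs no mounted filesystem; the image is simply there.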

glibc could in theory assemble an /etc/vdso.so file at boot time, then
open()/mmap()/close() it and pagefault it in, which would add roughly
10% to the cost of an exec(). If the best way for glibc to access this
information is a DSO, and the source of the information is the kernel
and only the kernel, then I find it hard to accept that glibc should
have to resort to some inferior method to get at it. [not to mention the
practical problem of a readonly or remote /etc: one would have to mount
ramfs, and mount /proc, to construct /ram/vdso.so. Also, nothing
runtime-critical could then be put into the vdso.]
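
For comparison, the file-based path would look roughly like the sketch below. The function name map_dso_file() is hypothetical, and the fstat() call is an assumption added here to learn the file size; the text above only names open(), mmap() and close().

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps the file at path read-only. Returns the mapping address and
 * stores the file size in *len, or returns NULL on any failure. */
static void *map_dso_file(const char *path, size_t *len)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);      /* fails if the fs is not mounted */

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                          /* the mapping outlives the fd */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;                           /* pages fault in on first access */
}
```

Every exec() would pay for these syscalls plus the page faults, and the whole thing fails if the file is not reachable.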

it could also be in /boot/modules/$ver/vdso.so, but this detaches the
vdso from the kernel, breaking the single-image kernel concept (which
is quite useful). It also forces glibc to do the uname() syscall
to get to the kernel version in addition to the DSO mapping syscalls -
again an inferior method to access this always-needed DSO.
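
A minimal sketch of that extra step, assuming $ver corresponds to the release field returned by uname(); the helper name versioned_vdso_path() is hypothetical:

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Builds "/boot/modules/<release>/vdso.so" in buf.
 * Returns 0 on success, -1 if uname() fails or buf is too small. */
static int versioned_vdso_path(char *buf, size_t size)
{
    struct utsname u;
    int n;

    if (uname(&u) < 0)                  /* the additional syscall */
        return -1;
    n = snprintf(buf, size, "/boot/modules/%s/vdso.so", u.release);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Only after this could the open()/mmap()/close() sequence even begin.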

Ingo