Re: sched_setaffinity usability

From: Ingo Molnar
Date: Thu Mar 18 2004 - 13:24:25 EST



* Linus Torvalds <torvalds@xxxxxxxx> wrote:

> sysconf() is a user-level implementation issue, and so is something
> like "number of CPU's". Damn, the simplest way to do it is as a
> environment variable, for christ sake! Just make a magic environment
> variable called __SC_ARRAY, and make it be some kind of binary
> encoding if you worry about performance.

i am not arguing for any sysconf() support at all - it clearly belongs
into glibc. Just doing 'man sysconf' shows that it should be in
user-space. No argument about that.

But how about the original issue Ulrich raised: how does user-space
figure out the NR_CPUS value supported by the kernel? (not the current #
of CPUs, that can be figured out using /proc/cpuinfo)

one solution would be what you suggest: to build some sort of /etc/info
file that glibc can access, which file is build during the kernel build
and contains the necessary constants. One problem with this approach is
that a user could boot via any arbitrary kernel, how does glibc (or even
a supposed early-init info-setup mechanism) know what info belongs to
which kernel? Kernel version numbers are not required to be unique. A
single non-modular bzImage can be used to have a fully working
userspace. Right now the kernel and glibc is isolated pretty much and
this gives us flexibility.

an environment variable is a similar solution and has the same problem:
it has to be generated somehow from the kernel's info, just like the
info file. As such it breaks the single-image concept, and the kernel
image and the 'metadata' can get detached.

but there's a clean solution i believe, a convenient object that we
already generate during kernel builds: the VDSO. It's mapped by the
kernel during execve() [or, in current kernels, is inherited via the
kernel mappings], and thus becomes a pretty fast (zero-copy) method of
having a kernel-specific user-space dynamic object. It's structured in
the most convenient (and thus, fastest, and most portable) way for
glibc's purposes. ld.so recognizes and uses it. It cannot be
misconfigured by the user, it comes with the kernel. It is included in a
single bzImage kernel just as much as a kernel rpm.

Right now the VDSO mostly contains code and exception-handling data, but
it could contain real, userspace-visible data just as much: info that is
only known during the kernel build. There's basically no cost in adding
more fields to the VDSO, and it seems to be superior to any of the other
approaches. Is there any reason not to do it?

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/