Re: [RFC] ps command race fix

From: Kyle Moffett
Date: Tue Aug 15 2006 - 22:18:52 EST

Next message: Christoph Lameter: "[MODSLAB 1/7] Extract allocpercpu from Slab"
Previous message: Andrew Morton: "Re: [PATCH] [2/3] Create call_usermodehelper_pipe()"
In reply to: Albert Cahalan: "Re: [RFC] ps command race fix"

On Aug 13, 2006, at 16:08:20, Albert Cahalan wrote:

> In general, process namespaces are useful for:
>
> 1. silly marketing (see Sun and FreeBSD)
>
> 2. the very obscure case of "root" account providers who are too clueless to use SE Linux or Xen

3. Kernel developers who need to test their kernel changes on multiple distributions can install those distributions unmodified in containers (requires process namespaces)

4. Server admins who want an absolutely unbreakable way (well, as close as one can get) to isolate virtual servers from each other and from the hardware. This also doesn't have the overhead of running 2 stacked memory allocators (Xen allocator and kernel allocator) as well as multiple copies of each kernel.

> I don't think either case justifies the complexity. I am not looking forward to the demands that I support this mess in procps. I suspect I am not alone; soon people will be asking for support in pstools, gdb, fuser, killall... until every app which interacts with other processes will need hacks.

IMHO support for PID namespaces should/would/will be done without changing the existing /proc entries or underlying layout. For example, one compatible solution would be to add a new "pidspace=" mount option to procfs which specifies which pidspace will be represented. Another method (also compatible with the pidspace= mount option, and enabling hierarchical pid spaces) would be to represent each child pidspace in its parent pidspace as a single process. Sending a signal to that process would deliver it to the "init" process in the child pidspace as though it came from the kernel (i.e., it isn't blocked by default). That process's /proc directory would then gain a "pidspace" subdirectory containing all of the proc entries for processes in the child space. Example:

# mount -t proc proc /proc
# create_new_pidspace /bin/sleep 100000 &
[1] 1234
# ls /proc
...
...
1234
...
...
# ls /proc/1234/pidspace
1

There are obviously still problems with this scenario (what numbering schemes do pidspaces use and how is the pidspace= mount option interpreted?), but it's an example of a few ways to preserve ABI compatibility with existing userspace. I think the idea behind all of this is to make it possible to run 6 different unmodified distros on a single system (but you need a few extra tools in the parent "boot" namespace).
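
To make the compatibility argument a bit more concrete, here is a rough Python sketch of how a pidspace-aware tool might walk the layout from the example above. It is only an illustration: the "pidspace" subdirectory is the proposal from this mail and not an existing kernel interface, and the function names top_level_pids and walk_pidspaces are made up for the sketch.

```python
import os


def top_level_pids(proc_root):
    """What an unmodified tool does today: numeric directories only."""
    pids = []
    for name in os.listdir(proc_root):
        # Skip non-process entries such as "self" or "cpuinfo".
        if name.isdecimal() and os.path.isdir(os.path.join(proc_root, name)):
            pids.append(int(name))
    return sorted(pids)


def walk_pidspaces(proc_root, path=()):
    """Yield a tuple of PIDs for every process, descending into the
    hypothetical <pid>/pidspace subdirectory of each pidspace container.

    (1234, 1) means "pid 1 inside the pidspace shown as pid 1234".
    """
    for pid in top_level_pids(proc_root):
        pid_path = path + (pid,)
        yield pid_path
        child = os.path.join(proc_root, str(pid), "pidspace")
        if os.path.isdir(child):
            # A child pidspace looks like a small /proc of its own.
            yield from walk_pidspaces(child, pid_path)
```

An old tool that only calls something like top_level_pids("/proc") keeps seeing the same flat list it always did, with the whole child pidspace appearing as one process. Only tools that care about nested pidspaces would need the recursive walk.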

> If the cost were only an #ifdef in the kernel, there would be no problem. Unfortunately, this is quite a hack in the kernel and it has far-reaching consequences in user space.

I believe it's been stated that rule #1 for this stuff is preserving backwards compatibility and kernel<=>userspace ABI with every change. We might add new APIs (which then have to remain the same in future versions, barring outright showstopper bugs), but existing APIs will still work correctly.

Cheers,
Kyle Moffett
