Re: [patch] PID namespaces

From: Dave Hansen
Date: Sun Nov 04 2007 - 15:12:46 EST


On Sun, 2007-11-04 at 11:38 +0100, Ingo Molnar wrote:
> I.e. keep the namespace functionality but use a modulo 1.000.000 base
> for the PIDs so that it all looks nicer to the user. Minimal visibility
> difference but maximum compatibility. (The resulting limits are
> reasonable: 1 million tasks per container and 4 million containers on a
> single 32-bit box.) We could still restrict cross-namespace API use but
> all the cases where a global PID is desirable would still all work. I
> might be missing something obvious though.

There is definitely a great deal of desire to have containers look as
much as possible like a normally functioning system. That includes
having an init process. Everything today depends on that init process
having a pretty specific pid (pid 1). That's definitely one of the 0.1%
of things that isn't really shaped by the kernel, but it's a pretty
important 0.1%. (Linux-VServer does this pid virtualization, but
_only_ for init, btw.)
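
A rough sketch of the modulo scheme, as read from the quoted proposal,
makes the init problem concrete. The function names, the container
numbering and the pid_bits parameter are all assumptions made for
illustration; nothing here is taken from the actual patch.

```python
BASE = 1_000_000  # per-container PID range, from the quoted proposal


def to_global(container_id, local_pid):
    """Pack a (container, local pid) pair into one global PID (hypothetical helper)."""
    if container_id < 0:
        raise ValueError("container_id must be non-negative")
    if not 0 < local_pid < BASE:
        raise ValueError("local pid out of range for this base")
    return container_id * BASE + local_pid


def from_global(global_pid):
    """Split a global PID back into (container_id, local_pid)."""
    return divmod(global_pid, BASE)


def container_slots(pid_bits=32):
    """Count the container-sized ranges that fit in a PID of the given width.

    The width is an assumption; the quoted text only says "32-bit box".
    """
    return 2**pid_bits // BASE


if __name__ == "__main__":
    init_pid = to_global(7, 1)     # init of a hypothetical container 7
    print(init_pid)                # 7000001
    print(from_global(init_pid))   # (7, 1)
    print(container_slots())       # 4294
```

Under this reading, init in container 7 is visible as 7000001 rather
than 1, so anything that tests for pid 1 inside the container would
not recognize it. The slot count is only the plain division for an
unsigned 32-bit value.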

We also need to consider the needs of a checkpoint/restart system. Most
of my interest in containers comes because of their isolation
properties. That isolation is what lets us pick a container up and move
it more easily across systems.

But, once we've moved the container, all of that "single, global kernel"
stuff goes out the window because it wasn't just one kernel making
decisions. Plus, those pids stop being just cookies that were issued
by one kernel and interpreted by one kernel.
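
As a toy illustration of that last point, assume a restart has to
reuse the pids the container's tasks had when they were checkpointed,
and that the target host hands out pids from a single global space.
The helper name and the sample numbers below are made up.

```python
def restore_conflicts(saved_pids, pids_in_use_on_target):
    """Return the saved PIDs that some task on the target host already holds."""
    return sorted(set(saved_pids) & set(pids_in_use_on_target))


if __name__ == "__main__":
    saved = [7_000_001, 7_000_042, 7_000_300]   # pids recorded at checkpoint
    target = [1, 412, 7_000_042]                # pids already live on the new host
    print(restore_conflicts(saved, target))
```

Any pid in the result was issued independently by both kernels, so the
container cannot simply keep it after the move.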

-- Dave
