Re: [RFC][patch 00/21] PID Virtualization: Overview and Patches

From: Hubertus Franke
Date: Thu Dec 15 2005 - 17:01:31 EST


On Thu, 2005-12-15 at 11:49 -0800, Gerrit Huizenga wrote:
> On Thu, 15 Dec 2005 09:35:57 EST, Hubertus Franke wrote:

> > PID Virtualization is based on the concept of a container.
> > The ultimate goal is to checkpoint/restart containers.
> >
> > The mechanism to start a container
> > is to 'echo "container_name" > /proc/container' which creates a new
> > container and associates the calling process with it. All subsequently
> > forked tasks then belong to that container.
> > There is a separate pid space associated with each container.
> > Only processes/tasks belonging to the same container "see" each other.
> > The exception is an implied default system container that has
> > a global view.
> >
> > The following patches accomplish 4 things:
> > 1) identify the locations at the user/kernel boundary where pids and
> > related ids (pgrp, session ids, ...) need to be (de-)virtualized and
> > call appropriate (de-)virtualization functions.
> > 2) provide the virtualization implementation in these functions.
> > 3) implement a container object and a simple /proc interface to create one.
> > 4) provide a per-container /proc/fs.
> >
> > -- Hubertus Franke (frankeh@xxxxxxxxxxxxxx)
> > -- Cedric Le Goater (clg@xxxxxxxxxx)
> > -- Serge E Hallyn (serue@xxxxxxxxxx)
> > -- Dave Hansen (haveblue@xxxxxxxxxx)
>
> I think this is actually quite interesting in a number of ways - it
> might actually be a way of cleanly addressing several current
> out-of-tree problems, several of which are independently (occasionally)
> striving for mainline adoption: vserver, openvz, cluster checkpoint/restart.

Indeed, that entire set might be able to benefit with respect to pid
virtualization. We are quite open to embracing a larger set of
applications of pid virtualization.

> I think perhaps this could also be the basis for a CKRM "class"
> grouping as well. Rather than maintaining an independent class
> affiliation for tasks, why not have a class devolve (evolve?) into
> a "container" as described here. The container provides much of
> the same grouping capabilities as a class as far as I can see. The
> right information would be available for scheduling and IO resource
> management. The memory component of CKRM is perhaps a bit tricky
> still, but an overall strategy (can I use that word here? ;-) might
> be to use these "containers" as the single intrinsic grouping mechanism
> for vserver, openvz, application checkpoint/restart, resource
> management, and possibly others?
>
> Opinions, especially from the CKRM folks? This might even be useful
> to the PAGG folks as a grouping mechanism, similar to their jobs or
> containers.
>
Not being too alien to the CKRM concept: yes, there is some nice synergy
here, as well as with PAGG and SGI's jobs. CKRM provides resource
constraints and runtime enforcement based on some grouping of
processes. As with containers, class membership is inherited (if
that's still the case since I last looked at it) until explicitly
changed. Containers, in particular, provide another dimension,
namely the ability to constrain the "visibility" of resources and objects,
with pids being the first resource handled this way.
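
To make the visibility point concrete, here is a rough userspace sketch
of the idea. It is not code from the patch set: every name in it
(struct container, struct task, task_create, task_visible,
pid_virtualize, pid_devirtualize) is hypothetical, and it assumes that
container id 0 stands for the implied default system container with the
global view, and that each other container numbers its pids from 1.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 16

/* All names below are illustrative assumptions, not the patch set's API. */
struct container {
    int id;         /* assumption: 0 is the default system container */
    int next_vpid;  /* next virtual pid to hand out in this container */
};

struct task {
    int kpid;                /* kernel-wide pid */
    int vpid;                /* pid as seen from inside the container */
    struct container *cont;  /* container the task belongs to */
};

static struct task tasks[MAX_TASKS];
static int ntasks;
static int next_kpid = 1;

/* Create a task in container c; a forked child would inherit c. */
struct task *task_create(struct container *c)
{
    assert(c != NULL && ntasks < MAX_TASKS);
    struct task *t = &tasks[ntasks++];
    t->kpid = next_kpid++;
    /* The default container has no private pid space in this sketch. */
    t->vpid = (c->id == 0) ? t->kpid : c->next_vpid++;
    t->cont = c;
    return t;
}

/* Tasks see each other only within one container; container 0 sees all. */
int task_visible(const struct task *viewer, const struct task *target)
{
    return viewer->cont->id == 0 || viewer->cont == target->cont;
}

/* Kernel -> user direction: pid reported to viewer, or -1 if hidden. */
int pid_virtualize(const struct task *viewer, const struct task *target)
{
    if (!task_visible(viewer, target))
        return -1;
    return viewer->cont->id == 0 ? target->kpid : target->vpid;
}

/* User -> kernel direction: resolve a pid passed in by viewer. */
struct task *pid_devirtualize(const struct task *viewer, int pid)
{
    for (int i = 0; i < ntasks; i++) {
        struct task *t = &tasks[i];
        if (task_visible(viewer, t) && pid_virtualize(viewer, t) == pid)
            return t;
    }
    return NULL;
}
```

In this sketch two containers can each hold a task with pid 1 without
conflict, and a lookup from one container never resolves to a task in
another. A CKRM-style class could hang resource limits off the same
grouping object, but that part is left out here.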

> "This patchset solves multiple problems".

> gerrit
>
--
Hubertus Franke <frankeh@xxxxxxxxxxxxxx>
