[PATCH 06/10] Add Documentation/namespaces/user_namespace.txt (v4)

From: Serge E. Hallyn
Date: Wed Oct 26 2011 - 16:32:49 EST

(Thanks for the feedback, David)

Provide a description of user namespaces

jul 26: incorporate feedback from David Howells.
jul 29: incorporate feedback from Randy Dunlap.
sep 15: remove information which is not yet certain.
Oct 26: add changes suggested by David on Oct 19

Signed-off-by: Serge E. Hallyn <serge.hallyn@xxxxxxxxxxxxx>
Cc: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
Cc: David Howells <dhowells@xxxxxxxxxx>
Cc: Randy Dunlap <rdunlap@xxxxxxxxxxxx>
Documentation/namespaces/user_namespace.txt | 54 +++++++++++++++++++++++++++
1 files changed, 54 insertions(+), 0 deletions(-)
create mode 100644 Documentation/namespaces/user_namespace.txt

diff --git a/Documentation/namespaces/user_namespace.txt b/Documentation/namespaces/user_namespace.txt
new file mode 100644
index 0000000..805f0ef
--- /dev/null
+++ b/Documentation/namespaces/user_namespace.txt
@@ -0,0 +1,54 @@
+Traditionally, each task is owned by a user ID (UID) and belongs to one or more
+groups (GID). Both are simple numeric IDs, though userspace usually translates
+them to names. The user namespace allows tasks to have different views of the
+UIDs and GIDs associated with tasks and other resources. (See 'UID mapping'
+below for more.)
+The user namespace is a simple hierarchical one. The system starts with all
+tasks belonging to the initial user namespace. A task creates a new user
+namespace by passing the CLONE_NEWUSER flag to clone(2). This requires the
+creating task to have the CAP_SETUID, CAP_SETGID, and CAP_CHOWN capabilities,
+but it does not need to be running as root. The clone(2) call will result in a
+new task which to itself appears to be running as UID and GID 0, but to its
+creator seems to have the creator's credentials.
+To this new task, any resource belonging to the initial user namespace will
+appear to belong to user and group 'nobody', which are UID and GID -1.
+Permission to open such files will be granted according to world access
+permissions. UID comparisons and group membership checks will fail, and
+privilege will be denied.
+When a task belonging to (for example) UID 500 in the initial user namespace
+creates a new user namespace, even though the new task will see itself as
+belonging to UID 0, any task in the initial user namespace will see it as
+belonging to UID 500. Therefore, UID 500 in the initial user namespace will be
+able to kill the new task.
+UID mapping for the VFS is not yet implemented, though prototypes exist.
+Relationship between the User namespace and other namespaces
+Other namespaces, such as UTS and network, are owned by a user namespace. When
+such a namespace is created, it is assigned to the user namespace of the task
+by which it was created. Therefore attempts to exercise privilege over a
+resource can be properly validated by checking for privilege targeted to the
+user namespace which owns the resource. For instance, a check for the special
+privilege to change a network interface address could be done by checking for
+CAP_NET_ADMIN against the user namespace which created the network namespace
+owning the network interface.
+( XXX TODO: add a list of capabilities corresponding to different namespaces)
+As an example, if a new task is cloned with a private user namespace but
+not a private network namespace, then the task's network namespace is owned
+by the parent user namespace. The new task has no special privilege over the
+parent user namespace, so it will not be able to create or configure the
+network devices therein. If instead the task were cloned with both private
+user and network namespaces, then the private network namespace is owned
+by the private user namespace, and so root in the new user namespace
+will have privilege over resources owned by the network namespace. It will
+be able to create and configure network devices.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/