[patch 2.1.97] more capabilities support

Andrew Morgan (morgan@transmeta.com)
Sun, 19 Apr 1998 03:17:54 -0700


Linus,

Appended is some more capability code. It:

* heads off the use of cap_t (which is defined in the POSIX.1e draft
as something else) and adds some more infrastructure to
<linux/capability.h> and a little code "clean"(?) up.

* adds some /proc/<PID>/status lines to display process capabilities
(tested)

* adds some sys_*() calls [i386 unistd.h bindings only] for getting
and setting process capabilities (not yet tested).

FYI: there is some old 'libcap' library code, a user-space API for
getting/setting capabilities in a POSIX-like way, here:

http://linux.kernel.org/pub/linux/libs/security/linux-privs/capabilities

It was written a while back when we put together some working patches
for 2.0.32 (it contained some experimental ext2 filesystem support for
capabilities...). I will rearrange it to work with the "official"
kernel code.

Anyway, time for bed.

Cheers

Andrew

[PS: Linus, this patch is also in ~morgan/kernel/caps.patch ;^) ]

========8<===================
diff -urN linux/Documentation/capabilities.txt linux-agm/Documentation/capabilities.txt
--- linux/Documentation/capabilities.txt Wed Dec 31 16:00:00 1969
+++ linux-agm/Documentation/capabilities.txt Sat Apr 18 17:57:24 1998
@@ -0,0 +1,140 @@
+capabilities.txt Andrew G. Morgan
+ <morgan@transmeta.com>
+ 1998/4/18
+
+ 1 Kernel capabilities
+
+ The basic idea behind capabilities is that the superuser concept on
+ normal UNIX systems is too coarsely grained, i.e., you either have all
+ access to the system or no privileged access as an ordinary user.
+
+ With capabilities, the privilege normally given to the user with UID=0
+ is split into a number of independent capabilities. Each capability
+ is designed to give limited access to kernel-level privilege, e.g., to
+ reconfigure a TTY or a network device, or to override read access on
+ a file that the reading user wouldn't normally be permitted to access.
+
+ 1.1 Definitions
+
+ Capabilities are stored as bit-sets. Currently these are 32 bits
+ each, but the kernel code has [WILL HAVE] the ability to truncate and
+ zero-fill as required to talk to other code that wants a different
+ size. [WARNING, THIS IS NOT YET TESTED.]
+
+ 1.1.1 Process capabilities
+
+ Each process has three sets of capabilities: Effective; Inheritable;
+ and Permitted.
+
+ 1.1.1.1 Inheritable
+
+ These capabilities may be inherited through exec(). In other words,
+ if a running process has capabilities "raised" in this set and then
+ exec()s another file, these capabilities are directly inherited by the
+ new executable. [THIS IS A COMMENT FOR WHEN FILE CAPS ARE
+ IMPLEMENTED :- However, this alone is not enough to actually grant
+ capabilities to the new executable: the executed file must have the
+ Inheritable capabilities set too.]
+
+ 1.1.1.2 Permitted
+
+ The capabilities that the process may raise in the Effective and
+ Inheritable sets.
+
+ 1.1.1.3 Effective
+
+ Subset of Permitted; used to determine what the process can actually
+ do.
+
+ 1.1.2 File capabilities
+
+ WARNING: there is currently no support for file capabilities
+
+ Each file has three sets of capabilities. When the file is exec()'d,
+ these three sets combine with those of the exec()'ing process
+ to establish the newly exec()'d process' capabilities.
+
+ 1.1.2.1 Inheritable
+
+ This is a mask against which the Inheritable capabilities of the
+ exec()'ing process are filtered. Only capabilities in this set may be
+ inherited by the exec()'d process.
+
+ Note, on filesystems that do not support capabilities, all executables
+ are assumed to have { ~0 } Inheritable capabilities.
+
+ 1.1.2.2 Effective
+
+ This set must have either all capabilities raised or all capabilities
+ lowered. It is used to indicate whether the program knows about
+ capabilities. If this set has capabilities raised, the program will
+ start with all of its Permitted capabilities in its Effective set. If
+ this set is empty, the program will start with an empty Effective set,
+ and will have to explicitly raise the capabilities it needs.
+
+ 1.1.2.3 Permitted
+
+ This set is the set of capabilities required by the executable in
+ order to do its job. These will appear in the Permitted set of the
+ process after exec()ing this executable.
+
+ 1.2 Capability inheritance
+
+ Following a sys_exec() call, the process' capability sets are modified
+ in the following way:
+
+ pI' = pI
+ pP' = fP | (fI & pI)
+ pE' = pP' & fE [NB. fE is 0 or ~0]
+
+ Key: I=Inheritable, P=Permitted, E=Effective // p=process, f=file
+ ' indicates post-exec().
+
+ Note, the Inheritable set is fully preserved across a sys_exec()
+ call.
+
+ Process capabilities are _not_ modified on fork (clone).
+
+ 1.3 Compatibility mode
+
+ The historical situation, of the superuser being omnipotent, is
+ partially preserved following a system call to sys_exec(). This
+ backward compatibility is preserved only in the case that
+ (issecure(SECURE_NOROOT) == 0).
+
+ 1.3.1 Details
+
+ Backward compatibility takes the following form:
+
+ 1 setuid-root binaries automatically have their Effective
+ and Permitted capabilities raised to equal the Inheritable
+ capabilities of the process that calls sys_exec().
+
+ 2 setuid-NON-root binaries automatically have their Effective
+ capabilities cleared. If the UID of the process calling
+ sys_exec() is 0, the Permitted capabilities are raised to
+ equal the Inheritable capabilities of the process that called
+ sys_exec(). Such processes become "capability smart".
+
+ 3 non-setuid binaries directly inherit the capabilities of the
+ process that sys_exec()s them.
+
+ 1.3.2 Discussion
+
+ Firstly, we note that this backward compatibility has only one effect
+ on systems that are not capability aware:
+
+ Point 2, of section 1.3.1, implies that such binaries
+ inherit the EUID of root, but none of root's kernel level
+ privileges.
+
+ Secondly, on a system that runs binaries which are capability aware,
+ without setting SECURE_NOROOT mode, it is possible to further restrict
+ the root account:
+
+ Point 1, of section 1.3.1, implies that a parent process
+ can reduce its Inheritable capabilities and thus prevent any
+ of its children from _ever_ acquiring all/any of root's kernel
+ level privileges. An immediate consequence of this is that it
+ becomes possible to implement escape proof sys_chroot() cells.
+
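
The exec() rule from section 1.2 above can be modelled in user space.
What follows is a minimal sketch, assuming 32-bit sets as in this patch.
The names cap_sets, file_caps, caps_after_exec and compat_setuid_root
are hypothetical and appear nowhere in the kernel code. The
compatibility-mode values used for a setuid-root binary (fI = ~0,
fP = 0, fE = ~0) are an assumption read off section 1.3.1, given that
file capabilities are not yet supported.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model; these types are not part of the patch. */
struct cap_sets  { uint32_t inheritable, permitted, effective; }; /* pI, pP, pE */
struct file_caps { uint32_t inheritable, permitted, effective; }; /* fI, fP, fE */

/*
 * Apply the rule from section 1.2:
 *   pI' = pI
 *   pP' = fP | (fI & pI)
 *   pE' = pP' & fE        (fE must be 0 or ~0)
 */
static struct cap_sets caps_after_exec(struct cap_sets p, struct file_caps f)
{
	struct cap_sets n;

	assert(f.effective == 0 || f.effective == ~(uint32_t)0);
	n.inheritable = p.inheritable;
	n.permitted   = f.permitted | (f.inheritable & p.inheritable);
	n.effective   = n.permitted & f.effective;
	return n;
}

/*
 * Assumed compatibility-mode file sets for a setuid-root binary
 * (point 1 of section 1.3.1): fI = ~0, fP = 0, fE = ~0.
 */
static struct file_caps compat_setuid_root(void)
{
	struct file_caps f = { ~(uint32_t)0, 0, ~(uint32_t)0 };
	return f;
}
```

Under these assumptions, a parent with pI = 0x3 that executes a
setuid-root binary ends up with pP' = pE' = 0x3, whatever its previous
Permitted set was. This is the restriction discussed in section 1.3.2.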
diff -urN linux/fs/exec.c linux-agm/fs/exec.c
--- linux/fs/exec.c Thu Apr 2 09:04:00 1998
+++ linux-agm/fs/exec.c Sun Apr 19 02:21:51 1998
@@ -622,7 +622,6 @@
* allowed set to maximum and the application to "capability
* smart").
*/
-
if (!issecure(SECURE_NOROOT)) {
if (bprm->e_uid == 0 || current->uid == 0)
cap_set_full(bprm->cap_inheritable);
@@ -630,24 +629,27 @@
cap_set_full(bprm->cap_effective);
}

- /* We use a conservative definition of suid for capabilities.
- * The process is suid if the permitted set is not a subset of
- * the current permitted set after the exec call.
- * new permitted set = forced | (allowed & inherited)
- * pP' = fP | (fI & pI)
- */
-
- if ((bprm->cap_permitted.cap |
- (current->cap_inheritable.cap &
- bprm->cap_inheritable.cap)) &
- ~current->cap_permitted.cap) {
- id_change = 1;
+ if (!id_change) {
+ /* Only if pP' is _not_ a subset of pP, do we consider there
+ * has been a capability related "change of id". In such
+ * cases, we need to check that the elevation of privilege
+ * does not go against other system constraints. The new
+ * Permitted set is defined below -- see (***). */
+
+ kernel_cap_t working =
+ cap_combine(bprm->cap_permitted,
+ cap_intersect(bprm->cap_inheritable,
+ current->cap_inheritable));
+ if (!cap_issubset(working, current->cap_permitted)) {
+ id_change = 1;
+ }
}

if (id_change) {
- /* We can't suid-execute if we're sharing parts of the executable */
- /* or if we're being traced (or if suid execs are not allowed) */
- /* (current->mm->count > 1 is ok, as we'll get a new mm anyway) */
+ /* We can't suid-execute if we're sharing parts of the executable
+ * or if we're being traced (or if suid execs are not allowed)
+ * (current->mm->count > 1 is ok, as we'll get a new mm anyway)
+ */
if (IS_NOSUID(inode)
|| (current->flags & PF_PTRACED)
|| (current->fs->count > 1)
@@ -669,7 +671,7 @@
* The formula used for evolving capabilities is:
*
* pI' = pI
- * pP' = fP | (fI & pI)
+ * (***) pP' = fP | (fI & pI)
* pE' = pP' & fE [NB. fE is 0 or ~0]
*
* I=Inheritable, P=Permitted, E=Effective // p=process, f=file
diff -urN linux/fs/proc/array.c linux-agm/fs/proc/array.c
--- linux/fs/proc/array.c Wed Mar 11 15:53:18 1998
+++ linux-agm/fs/proc/array.c Sat Apr 18 15:55:02 1998
@@ -656,11 +656,17 @@
"Pid:\t%d\n"
"PPid:\t%d\n"
"Uid:\t%d\t%d\t%d\t%d\n"
- "Gid:\t%d\t%d\t%d\t%d\n",
+ "Gid:\t%d\t%d\t%d\t%d\n"
+ "CapEffective:\t%08x\n"
+ "CapInheritable:\t%08x\n"
+ "CapPermitted:\t%08x\n",
get_task_state(p),
p->pid, p->p_pptr->pid,
p->uid, p->euid, p->suid, p->fsuid,
- p->gid, p->egid, p->sgid, p->fsgid);
+ p->gid, p->egid, p->sgid, p->fsgid,
+ p->cap_effective.cap,
+ p->cap_inheritable.cap,
+ p->cap_permitted.cap);
return buffer;
}

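
The three new status lines are plain "Name:<TAB>%08x" fields, so a
user-space tool can read them back with very little code. A minimal
sketch follows, assuming the caller has already read /proc/<PID>/status
into a NUL-terminated buffer. parse_cap_line is a hypothetical helper
name, not part of this patch or of any library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Find the line "<key>:\t<hex>" in a copy of /proc/<PID>/status and
 * store the value in *out.  Returns 0 on success, -1 if the line is
 * missing or malformed.  (Hypothetical helper, for illustration only.)
 */
static int parse_cap_line(const char *status, const char *key, uint32_t *out)
{
	size_t klen;
	const char *line = status;

	assert(status != NULL && key != NULL && out != NULL);
	klen = strlen(key);

	while (line && *line) {
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			unsigned int v;

			/* %x skips the tab that follows the colon */
			if (sscanf(line + klen + 1, "%x", &v) != 1)
				return -1;
			*out = (uint32_t)v;
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline */
	}
	return -1;
}
```

A caller would pass "CapEffective", "CapInheritable" or "CapPermitted"
as the key.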
diff -urN linux/include/asm-i386/unistd.h linux-agm/include/asm-i386/unistd.h
--- linux/include/asm-i386/unistd.h Tue Apr 7 08:05:05 1998
+++ linux-agm/include/asm-i386/unistd.h Sun Apr 19 00:40:26 1998
@@ -189,6 +189,8 @@
#define __NR_pwrite 181
#define __NR_chown 182
#define __NR_getcwd 183
+#define __NR_getproccap 184
+#define __NR_setproccap 185

/* user-visible error numbers are in the range -1 - -122: see <asm-i386/errno.h> */

diff -urN linux/include/linux/capability.h linux-agm/include/linux/capability.h
--- linux/include/linux/capability.h Tue Apr 14 13:00:11 1998
+++ linux-agm/include/linux/capability.h Sun Apr 19 02:06:18 1998
@@ -1,7 +1,7 @@
/*
* This is <linux/capability.h>
*
- * Andrew G. Morgan <morgan@parc.power.net>
+ * Andrew G. Morgan <morgan@transmeta.com>
* Alexander Kjeldaas <astor@guardian.no>
* with help from Aleph1, Roland Buresund and Andrew Main.
*/
@@ -17,20 +17,29 @@
kernel might be somewhat backwards compatible, but don't bet on
it. */

-#define _LINUX_CAPABILITY_VERSION 0x19980330
+#define _LINUX_CAPABILITY_VERSION 0x19980418
+
+/* XXX - Note, cap_t, is defined by POSIX to be an "opaque" pointer to
+ a set of three capability sets. The transposition of 3*the
+ following structure to such a composite is better handled in a user
+ library since the draft standard requires the use of malloc/free
+ etc.. */

typedef struct _user_cap_struct {
__u32 version;
- __u32 size;
+ __u32 size; /* number of bytes of capability bits */
__u8 cap[1];
-} *cap_t;
+} *cap_user_t;

#ifdef __KERNEL__

typedef struct kernel_cap_struct {
- int cap;
+ __u32 cap;
} kernel_cap_t;

+#define _USER_CAP_HEADER_SIZE (2*sizeof(__u32))
+#define _KERNEL_CAP_T_SIZE (sizeof(kernel_cap_t))
+
#endif


@@ -189,15 +198,43 @@
#define CAP_FULL_SET { ~0 }

#define CAP_TO_MASK(x) (1 << (x))
-#define cap_raise(c, flag) (c.cap |= CAP_TO_MASK(flag))
-#define cap_lower(c, flag) (c.cap &= ~CAP_TO_MASK(flag))
-#define cap_raised(c, flag) (c.cap & CAP_TO_MASK(flag))
-
-#define cap_isclear(c) (!c.cap)
-
-#define cap_copy(dest,src) do { (dest).cap = (src).cap; } while(0)
-#define cap_clear(c) do { c.cap = 0; } while(0)
-#define cap_set_full(c) do { c.cap = ~0; } while(0)
+#define cap_raise(c, flag) ((c).cap |= CAP_TO_MASK(flag))
+#define cap_lower(c, flag) ((c).cap &= ~CAP_TO_MASK(flag))
+#define cap_raised(c, flag) ((c).cap & CAP_TO_MASK(flag))
+
+static inline kernel_cap_t cap_combine(kernel_cap_t a, kernel_cap_t b)
+{
+ kernel_cap_t dest;
+ dest.cap = a.cap | b.cap;
+ return dest;
+}
+
+static inline kernel_cap_t cap_intersect(kernel_cap_t a, kernel_cap_t b)
+{
+ kernel_cap_t dest;
+ dest.cap = a.cap & b.cap;
+ return dest;
+}
+
+static inline kernel_cap_t cap_drop(kernel_cap_t a, kernel_cap_t drop)
+{
+ kernel_cap_t dest;
+ dest.cap = a.cap & ~drop.cap;
+ return dest;
+}
+
+static inline kernel_cap_t cap_invert(kernel_cap_t c)
+{
+ kernel_cap_t dest;
+ dest.cap = ~c.cap;
+ return dest;
+}
+
+#define cap_isclear(c) (!(c).cap)
+#define cap_issubset(a,set) (!((a).cap & ~(set).cap))
+
+#define cap_clear(c) do { (c).cap = 0; } while(0)
+#define cap_set_full(c) do { (c).cap = ~0; } while(0)

#define cap_is_fs_cap(c) ((c) & CAP_FS_MASK)

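
The size field in struct _user_cap_struct lets user space hand over a
bit-set whose length differs from the kernel's. A minimal user-space
sketch of the intended truncate/zero-fill behaviour follows, assuming a
32-bit kernel set. cap_bytes_to_u32 and KERNEL_CAP_BYTES are
hypothetical names standing in for the kernel-side copy and for
_KERNEL_CAP_T_SIZE; no user-space access checks are modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KERNEL_CAP_BYTES sizeof(uint32_t)	/* models _KERNEL_CAP_T_SIZE */

/*
 * Copy a user bit-set of usize bytes into a 32-bit set: bytes beyond
 * the kernel size are dropped, missing bytes are zero-filled.
 * (Hypothetical helper, not a kernel function.)
 */
static uint32_t cap_bytes_to_u32(const uint8_t *ucap, size_t usize)
{
	uint32_t k = 0;				/* zero-fill up front */
	size_t n = usize > KERNEL_CAP_BYTES ? KERNEL_CAP_BYTES : usize;

	assert(ucap != NULL || n == 0);
	if (n)
		memcpy(&k, ucap, n);		/* keep at most 4 bytes */
	return k;
}
```

Because this is a byte-wise copy, which bits a short set lands in
depends on the host byte order.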
diff -urN linux/kernel/capability.c linux-agm/kernel/capability.c
--- linux/kernel/capability.c Wed Dec 31 16:00:00 1969
+++ linux-agm/kernel/capability.c Sun Apr 19 02:09:38 1998
@@ -0,0 +1,165 @@
+/*
+ * linux/kernel/capability.c
+ *
+ * Copyright (C) 1997 Andrew Main <zefram@fysh.org>
+ * Integrated into 2.1.97+ Andrew G. Morgan <morgan@transmeta.com>
+ */
+
+#include <linux/config.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/capability.h>
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/string.h>
+
+/*
+ * User programs might have been compiled with a different idea of the number
+ * of capabilities. These functions provide a standard mechanism for
+ * translating between kernelspace and userspace capability structures.
+ * Bitsets are truncated if too big, and zero-filled if too small.
+ */
+
+static int cap_fromuser(kernel_cap_t *k, const cap_user_t u)
+{
+ size_t usize=0, rsize;
+
+ if (!access_ok(VERIFY_READ, u, _USER_CAP_HEADER_SIZE))
+ return -EINVAL;
+
+ /* XXX - we do not pay attention to the u->version */
+
+ copy_from_user(&usize, &(u->size), sizeof(u->size));
+ if (usize > _KERNEL_CAP_T_SIZE)
+ rsize = _KERNEL_CAP_T_SIZE;
+ else
+ rsize = usize;
+
+ if (!access_ok(VERIFY_READ, u->cap, rsize))
+ return -EINVAL;
+
+ copy_from_user(k, u->cap, rsize);
+ if (rsize < _KERNEL_CAP_T_SIZE)
+ memset(rsize + (__u8 *) k, 0, _KERNEL_CAP_T_SIZE - rsize);
+
+ return 0;
+}
+
+
+static int cap_touser(cap_user_t u, const kernel_cap_t *k, size_t usize)
+{
+ struct _user_cap_struct uheader;
+
+ if (!access_ok(VERIFY_WRITE, u, usize))
+ return -EINVAL;
+
+ if (usize < _USER_CAP_HEADER_SIZE)
+ return -EINVAL;
+
+ if (usize > (_USER_CAP_HEADER_SIZE+_KERNEL_CAP_T_SIZE))
+ usize = _KERNEL_CAP_T_SIZE;
+ else
+ usize -= _USER_CAP_HEADER_SIZE;
+
+ uheader.version = _LINUX_CAPABILITY_VERSION;
+ uheader.size = usize;
+ copy_to_user(u, &uheader, _USER_CAP_HEADER_SIZE);
+ copy_to_user(u->cap, k, usize);
+
+ return 0;
+}
+
+/*
+ * For sys_getproccap() and sys_setproccap(), any of the three
+ * capability set pointers may be NULL -- indicating that that set is
+ * uninteresting and/or not to be changed.
+ */
+
+asmlinkage int sys_getproccap(size_t usize, cap_user_t eset, cap_user_t iset,
+ cap_user_t pset)
+{
+ int error;
+
+ if (pset) {
+ if ((error = cap_touser(pset,&current->cap_permitted,usize)))
+ return error;
+ }
+
+ if (iset) {
+ if ((error = cap_touser(iset,&current->cap_inheritable,usize)))
+ return error;
+ }
+
+ if (eset) {
+ if ((error = cap_touser(eset,&current->cap_effective,usize)))
+ return error;
+ }
+
+ return 0;
+}
+
+/*
+ * The restrictions on setting capabilities are specified by POSIX:
+ *
+ * I: any raised capabilities must be a subset of the (old) Permitted
+ * P: permitted capabilities can only be removed and never added.
+ * E: must be set to a subset of (new) Permitted
+ */
+
+asmlinkage int sys_setproccap(size_t usize, const cap_user_t eset,
+ const cap_user_t iset, const cap_user_t pset)
+{
+ kernel_cap_t inheritable, permitted, effective;
+ int error;
+
+ /* copy from userspace; the set size is read from each header */
+ if (eset) {
+ if ((error = cap_fromuser(&effective, eset)))
+ return error;
+ }
+ if (iset) {
+ if ((error = cap_fromuser(&inheritable, iset)))
+ return error;
+ }
+ if (pset) {
+ if ((error = cap_fromuser(&permitted, pset)))
+ return error;
+ }
+
+ /* If we are changing the Inheritable set, the _newly-raised_
+ capabilities must be a subset of the _old_Permitted_ set. */
+
+ if (!iset) {
+ inheritable = current->cap_inheritable;
+ } else if (!cap_isclear(cap_drop(inheritable, cap_combine
+ (current->cap_inheritable,
+ current->cap_permitted)
+ ))) {
+ return -EPERM;
+ }
+
+ /* verify _new_Permitted_ is a subset of the _old_Permitted_ set */
+
+ if (!pset) {
+ permitted = current->cap_permitted;
+ } else if (!cap_issubset(permitted, current->cap_permitted)) {
+ return -EPERM;
+ }
+
+ /* verify the _new_Effective_ is a subset of the _new_Permitted_ */
+
+ if (!eset) {
+ effective = cap_intersect(current->cap_effective, permitted);
+ } else if (!cap_issubset(effective, permitted)) {
+ return -EPERM;
+ }
+
+ /* having verified that the proposed changes are legal,
+ we realize the new capabilities. */
+
+ current->cap_inheritable = inheritable;
+ current->cap_permitted = permitted;
+ current->cap_effective = effective;
+
+ return 0;
+}
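
The three rules enforced by sys_setproccap() can be checked in user
space. The following is a minimal sketch with 32-bit sets, not the
kernel code: try_setproccap and struct cap_sets are hypothetical names,
a NULL pointer means "leave this set alone" as in the syscall, and a
return of -1 stands in for -EPERM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of a process' three sets. */
struct cap_sets { uint32_t inheritable, permitted, effective; };

/* Returns 0 and updates *cur if the request is legal, -1 otherwise. */
static int try_setproccap(struct cap_sets *cur, const uint32_t *e,
			  const uint32_t *i, const uint32_t *p)
{
	uint32_t ni, np, ne;

	assert(cur != NULL);

	/* I: newly raised bits must come from the old Permitted set */
	ni = i ? *i : cur->inheritable;
	if (ni & ~(cur->inheritable | cur->permitted))
		return -1;

	/* P: bits may be removed but never added */
	np = p ? *p : cur->permitted;
	if (np & ~cur->permitted)
		return -1;

	/* E: subset of the new Permitted; if not given, clip the old one */
	ne = e ? *e : (cur->effective & np);
	if (ne & ~np)
		return -1;

	/* all checks passed, commit the new sets */
	cur->inheritable = ni;
	cur->permitted   = np;
	cur->effective   = ne;
	return 0;
}
```

Nothing is committed until all three checks pass, so a rejected request
leaves the sets untouched.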
