Re: [PATCH v4 1/1] fs: Allow no_new_privs tasks to call chroot(2)

From: Kees Cook
Date: Tue Mar 16 2021 - 15:25:31 EST


On Tue, Mar 16, 2021 at 08:04:09PM +0100, Jann Horn wrote:
> On Tue, Mar 16, 2021 at 6:02 PM Mickaël Salaün <mic@xxxxxxxxxxx> wrote:
> > One could argue that chroot(2) is useless without a properly populated
> > root hierarchy (i.e. without /dev and /proc). However, there are
> > multiple use cases that don't require the chrooting process to create
> > file hierarchies with special files or mount points, e.g.:
> > * A process sandboxing itself, once all its libraries are loaded, may
> > not need files other than regular files, or even no file at all.
> > * Some pre-populated root hierarchies could be used to chroot into,
> > provided for instance by development environments or tailored
> > distributions.
> > * Processes executed in a chroot may not require access to these special
> > files (e.g. with minimal runtimes, or by emulating some special files
> > with a LD_PRELOADed library or seccomp).
> >
> > Unprivileged chroot is especially interesting for userspace developers
> > wishing to harden their applications. For instance, chroot(2) and Yama
> > make it possible to build capability-based security (i.e. remove ambient
> > filesystem access) by calling chroot/chdir with an empty directory and
> > accessing data through dedicated file descriptors obtained with
> > openat2(2) and RESOLVE_BENEATH/RESOLVE_IN_ROOT/RESOLVE_NO_MAGICLINKS.
>
> I don't entirely understand. Are you writing this with the assumption
> that a future change will make it possible to set these RESOLVE flags
> process-wide, or something like that?

I thought it meant "open all out-of-chroot dirs as fds using RESOLVE_...
flags then chroot". As in, there's no way to then escape "up" for the
old opens, and the new opens stay in the chroot.
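
To make that reading concrete, here is a rough userspace sketch, not taken
from the patch. The helper names open_beneath() and confine_to_empty_root()
are made up for illustration. It assumes Linux >= 5.6 headers for openat2(2),
and the chroot() step only succeeds for an unprivileged caller on a kernel
that carries the behavior proposed in this series (otherwise it needs
CAP_SYS_CHROOT).

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/openat2.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Open `path` relative to `dirfd`, refusing any resolution that would
 * leave the tree rooted at `dirfd` ("..", absolute paths, escaping
 * symlinks) or that goes through a magic link.
 * Returns a file descriptor, or -1 with errno set.
 */
static int open_beneath(int dirfd, const char *path, int flags)
{
	struct open_how how = {
		.flags = (__u64)(unsigned int)flags,
		.resolve = RESOLVE_BENEATH | RESOLVE_NO_MAGICLINKS,
	};

	return (int)syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
}

/*
 * Hypothetical helper: grab a directory fd for the data we still need,
 * then drop ambient filesystem access by chrooting into an empty
 * directory. Returns the data directory fd, or -1 on failure.
 */
static int confine_to_empty_root(const char *data_dir, const char *empty_dir)
{
	int data_fd = open(data_dir, O_PATH | O_DIRECTORY | O_CLOEXEC);

	if (data_fd < 0)
		return -1;

	/* Assumption: unprivileged chroot(2) is gated on no_new_privs. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) ||
	    chroot(empty_dir) || chdir("/")) {
		close(data_fd);
		return -1;
	}

	return data_fd;
}
```

After confine_to_empty_root() returns, path lookups from "/" only see the
empty directory, while open_beneath(data_fd, "some/file", O_RDONLY) can still
reach the old tree but cannot walk "up" out of it.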

> [...]
> > diff --git a/fs/open.c b/fs/open.c
> [...]
> > +static inline int current_chroot_allowed(void)
> > +{
> > + /*
> > + * Changing the root directory for the calling task (and its future
> > + * children) requires that this task has CAP_SYS_CHROOT in its
> > + * namespace, or be running with no_new_privs and not sharing its
> > + * fs_struct and not escaping its current root (cf. create_user_ns()).
> > + * As for seccomp, checking no_new_privs avoids scenarios where
> > + * unprivileged tasks can affect the behavior of privileged children.
> > + */
> > + if (task_no_new_privs(current) && current->fs->users == 1 &&
>
> this read of current->fs->users should be using READ_ONCE()

Ah yeah, good call. I should remember this when I think "can this race?"
:P
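
For anyone following along, a small userspace sketch of what the READ_ONCE()
suggestion amounts to. The macro below is only an approximation of the
kernel's READ_ONCE() for scalar types (it relies on the GCC/Clang __typeof__
extension), and fs_struct_sketch and chroot_allowed_sketch() are made-up
stand-ins. Only the part of the condition quoted above is modeled; the
CAP_SYS_CHROOT path and the rest of the check are left out.

```c
/*
 * Approximation of the kernel's READ_ONCE() for scalars: the volatile
 * cast forces the compiler to emit exactly one load, so a concurrent
 * update cannot be observed as two different values within one check.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Stand-in for the one fs_struct field the check looks at. */
struct fs_struct_sketch {
	int users;
};

/*
 * Models only the quoted part of the condition: the task runs with
 * no_new_privs and does not share its fs_struct with anyone else.
 */
static int chroot_allowed_sketch(int no_new_privs,
				 struct fs_struct_sketch *fs)
{
	return no_new_privs && READ_ONCE(fs->users) == 1;
}
```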

--
Kees Cook