Re: [PATCH v3 5/7] fs: Treat foreign mounts as nosuid

From: Seth Forshee
Date: Thu Sep 17 2015 - 08:50:31 EST


On Wed, Sep 16, 2015 at 01:57:10PM -0700, Andy Lutomirski wrote:
> On Wed, Sep 16, 2015 at 1:02 PM, Seth Forshee
> <seth.forshee@xxxxxxxxxxxxx> wrote:
> > From: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> >
> > If a process gets access to a mount from a different user
> > namespace, that process should not be able to take advantage of
> > setuid files or selinux entrypoints from that filesystem. Prevent
> > this by treating mounts from other mount namespaces and those not
> > owned by current_user_ns() or an ancestor as nosuid.
> >
> > This will make it safer to allow more complex filesystems to be
> > mounted in non-root user namespaces.
> >
> > This does not remove the need for MNT_LOCK_NOSUID. The setuid,
> > setgid, and file capability bits can no longer be abused if code in
> > a user namespace were to clear nosuid on an untrusted filesystem,
> > but this patch, by itself, is insufficient to protect the system
> > from abuse of files that, when execed, would increase MAC privilege.
> >
> > As a more concrete explanation, any task that can manipulate a
> > vfsmount associated with a given user namespace already has
> > capabilities in that namespace and all of its descendants. If they
> > can cause a malicious setuid, setgid, or file-caps executable to
> > appear in that mount, then that executable will only allow them to
> > elevate privileges in exactly the set of namespaces in which they
> > are already privileged.
> >
> > On the other hand, if they can cause a malicious executable to
> > appear with a dangerous MAC label, running it could change the
> > caller's security context in a way that should not have been
> > possible, even inside the namespace in which the task is confined.
> >
> > As a hardening measure, this would have made CVE-2014-5207 much
> > more difficult to exploit.
> >
> > Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> > Signed-off-by: Seth Forshee <seth.forshee@xxxxxxxxxxxxx>
> > ---
> >  fs/exec.c                | 2 +-
> >  fs/namespace.c           | 13 +++++++++++++
> >  include/linux/mount.h    | 1 +
> >  security/commoncap.c     | 2 +-
> >  security/selinux/hooks.c | 2 +-
> >  5 files changed, 17 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/exec.c b/fs/exec.c
> > index b06623a9347f..ea7311d72cc3 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -1295,7 +1295,7 @@ static void bprm_fill_uid(struct linux_binprm *bprm)
> >  	bprm->cred->euid = current_euid();
> >  	bprm->cred->egid = current_egid();
> >
> > -	if (bprm->file->f_path.mnt->mnt_flags & MNT_NOSUID)
> > +	if (!mnt_may_suid(bprm->file->f_path.mnt))
> >  		return;
> >
> >  	if (task_no_new_privs(current))
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index da70f7c4ece1..2101ce7b96ab 100644
> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> > @@ -3276,6 +3276,19 @@ found:
> >  	return visible;
> >  }
> >
> > +bool mnt_may_suid(struct vfsmount *mnt)
> > +{
> > +	/*
> > +	 * Foreign mounts (accessed via fchdir or through /proc
> > +	 * symlinks) are always treated as if they are nosuid. This
> > +	 * prevents namespaces from trusting potentially unsafe
> > +	 * suid/sgid bits, file caps, or security labels that originate
> > +	 * in other namespaces.
> > +	 */
> > +	return !(mnt->mnt_flags & MNT_NOSUID) && check_mnt(real_mount(mnt)) &&
> > +	       in_userns(current_user_ns(), mnt->mnt_sb->s_user_ns);
>
> Is check_mnt correct here? If I read it correctly, this means that,
> if I just unshare my userns and do nothing else (and, in particular,
> don't unshare my mount namespace), then everything will have
> mnt_may_suid return false.

The condition in check_mnt is exactly the same as the condition that
the check_mnt call replaces: the mount must belong to the caller's mount
namespace. If mnt_may_suid returned true before you unshared only your
user namespace, then it should also return true after the unshare. The
mount ns is the same as it was before, so check_mnt returns true, and
the new user namespace is a child of the previous one, so in_userns also
returns true.
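
To make the reasoning concrete, here is a rough user-space model of the
three checks. It is only a sketch: the toy_* structures and functions
and the two scenario helpers are hypothetical stand-ins invented for
illustration, not the kernel's real types, and it assumes in_userns()
means "same namespace as, or a descendant of, the target".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; none of these are the real structures. */
struct toy_user_ns { const struct toy_user_ns *parent; };
struct toy_mnt_ns { int id; };

struct toy_mount {
	bool nosuid;                         /* models MNT_NOSUID */
	const struct toy_mnt_ns *mnt_ns;     /* mount namespace holding the mount */
	const struct toy_user_ns *s_user_ns; /* user namespace owning the superblock */
};

struct toy_task {
	const struct toy_mnt_ns *mnt_ns;
	const struct toy_user_ns *user_ns;
};

/* Assumed semantics: true if ns is target itself or a descendant of it. */
static bool toy_in_userns(const struct toy_user_ns *ns,
			  const struct toy_user_ns *target)
{
	for (; ns; ns = ns->parent)
		if (ns == target)
			return true;
	return false;
}

/* Models check_mnt(): the mount must be in the caller's mount namespace. */
static bool toy_check_mnt(const struct toy_task *t, const struct toy_mount *m)
{
	return m->mnt_ns == t->mnt_ns;
}

/* Models the mnt_may_suid() expression from the patch. */
static bool toy_mnt_may_suid(const struct toy_task *t, const struct toy_mount *m)
{
	return !m->nosuid && toy_check_mnt(t, m) &&
	       toy_in_userns(t->user_ns, m->s_user_ns);
}

/* Scenario from this thread: unshare the user namespace and nothing else. */
static bool may_suid_after_userns_only_unshare(void)
{
	static const struct toy_user_ns init_userns = { NULL };
	static const struct toy_user_ns child_userns = { &init_userns };
	static const struct toy_mnt_ns init_mntns = { 1 };
	struct toy_mount m = { false, &init_mntns, &init_userns };
	struct toy_task t = { &init_mntns, &init_userns };

	assert(toy_mnt_may_suid(&t, &m)); /* allowed before the unshare */
	t.user_ns = &child_userns;        /* mount namespace is unchanged */
	return toy_mnt_may_suid(&t, &m);
}

/* Contrast: the same mount reached from a different mount namespace. */
static bool may_suid_from_foreign_mntns(void)
{
	static const struct toy_user_ns init_userns = { NULL };
	static const struct toy_mnt_ns init_mntns = { 1 };
	static const struct toy_mnt_ns other_mntns = { 2 };
	struct toy_mount m = { false, &init_mntns, &init_userns };
	struct toy_task t = { &other_mntns, &init_userns };

	return toy_mnt_may_suid(&t, &m);
}
```

In this model the first helper returns true, because the task still
shares the mount's mount namespace and its new user namespace descends
from the one owning the superblock. The second returns false, because
the mount namespace comparison fails.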

Seth