Re: [RFC v2 PATCH 3/8] fs: Treat foreign mounts as nosuid

From: Seth Forshee
Date: Thu May 05 2016 - 09:05:24 EST

On Wed, May 04, 2016 at 11:19:04PM +0000, Serge Hallyn wrote:
> Quoting Djalal Harouni (tixxdz@xxxxxxxxx):
> > If a process gets access to a mount from a different user
> > namespace, that process should not be able to take advantage of
> > setuid files or selinux entrypoints from that filesystem. Prevent
> > this by treating mounts from other mount namespaces and those not
> > owned by current_user_ns() or an ancestor as nosuid.
> >
> > This patch was just adapted from the original one that was written
> > by Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> >
> I'm not sure that this makes sense given what you're doing. In the
> case of Seth's set, a filesystem is mounted specifically (and privately)
> in a user namespace. We don't want for instance the initial user ns
> to find a link to a setuid-root exploit left in the container-mounted
> filesystem.
> But you are having a parent user namespace mount the fs so that its
> children can all access the fs, uid-shifted for convenience. Not
> allowing the child namespaces to make use of setuid-root does not
> seem applicable here.

Right, the problem addressed by this patch probably isn't relevant to
this sort of uid shifting.

But I think there's another problem that needs to be addressed.
bprm_fill_uid() still takes the ids for sxid (setuid/setgid) files
unshifted from the inode. We already refuse sxid to any id that has no
mapping in bprm->cred->user_ns, so exec will just ignore the sxid bits
instead of e.g. becoming global root via a suid-root file on the id
shifted mount, which is good. What we would want, though, is for it to
use the shifted ids, so that something like suid-root ping in the
container rootfs would work.
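
To make the difference concrete, here is a rough userspace model of that
check. It is only a sketch and not kernel code: struct uid_extent,
model_kuid_has_mapping() and model_exec_euid() are hypothetical names, the
map is reduced to a single extent, and "shift" stands in for whatever
per-mount id offset the shifting patches would apply. A shift of 0 models
the unshifted behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical single-extent uid map for a user namespace: ids
 * [first, first + count) inside the namespace correspond to kernel ids
 * [lower_first, lower_first + count).
 */
struct uid_extent {
	uint32_t first;
	uint32_t lower_first;
	uint32_t count;
};

/* Models the "does this kernel uid have a mapping in the ns" check. */
static bool model_kuid_has_mapping(const struct uid_extent *map, uint32_t kuid)
{
	assert(map != NULL);
	return kuid >= map->lower_first && kuid - map->lower_first < map->count;
}

/*
 * Returns the euid a task would end up with after exec'ing a file.
 * The owner is the on-disk uid plus the (assumed) per-mount shift.
 * If the file is not setuid, or the owner has no mapping in the
 * caller's user namespace, the setuid bit is ignored and the caller
 * keeps its own euid.
 */
static uint32_t model_exec_euid(const struct uid_extent *ns_map,
				uint32_t caller_euid, uint32_t inode_uid,
				uint32_t shift, bool is_setuid)
{
	uint32_t owner = inode_uid + shift;

	if (!is_setuid)
		return caller_euid;
	if (!model_kuid_has_mapping(ns_map, owner))
		return caller_euid;
	return owner;
}
```

With a container map of { 0, 100000, 65536 } and a caller whose euid is
101000, a suid file owned by on-disk uid 0 leaves the euid at 101000 when
shift is 0 (the sxid is ignored), and yields 100000, i.e. root inside the
container, when shift is 100000.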