Re: [RFC v2 v2 1/1] ns: add binfmt_misc to the mount namespace

From: Jann Horn
Date: Wed Oct 03 2018 - 15:27:26 EST


On Wed, Oct 3, 2018 at 8:07 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> Laurent Vivier <laurent@xxxxxxxxx> writes:
> > This patch allows to have a different binftm_misc configuration
> > in each container we mount binfmt_misc filesystem with mount namespace
> > enabled.
> >
> > A container started without the CLONE_NEWNS will use the host binfmt_misc
> > configuration, otherwise the container starts with an empty binfmt_misc
> > interpreters list.
> >
> > For instance, using "unshare" we can start a chroot of an another
> > architecture and configure the binfmt_misc interpreted without being root
> > to run the binaries in this chroot.
>
> A couple of things.
> As has already been mentioned on your previous version anything that
> comes through the filesystem interface needs to lookup up the local
> binfmt context not through current but through file->f_dentry->d_sb.
> AKA the superblock of the mounted filesystem.

Something else: bm_register_write() currently calls into open_exec(),
which uses the credentials of current. That's not really allowed in
this context - but so far, it's not a big deal because only
init-namespace root can reach this code. Before you expose this stuff
to unprivileged userspace, this needs to get fixed; perhaps by
wrapping the open_exec() call in override_creds(file->f_cred) and
revert_creds().