Re: [PATCH 2/2] mountinfo: implement show_path for kernfs and cgroup
From: Serge E. Hallyn
Date: Tue Apr 19 2016 - 00:05:54 EST
Quoting Serge E. Hallyn (serge@xxxxxxxxxx):
> Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> > "Serge E. Hallyn" <serge.hallyn@xxxxxxxxxx> writes:
> >
> > >> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> > >> index 671dc05..9a0d7b3 100644
> > >> --- a/kernel/cgroup.c
> > >> +++ b/kernel/cgroup.c
> > >> @@ -1593,6 +1593,40 @@ static int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask)
> > >> 	return 0;
> > >>  }
> > >>
> > >> +static int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
> > >> +			    struct kernfs_root *kf_root)
> > >> +{
> > >> +	int len = 0, ret = 0;
> > >> +	char *buf = NULL;
> > >> +	struct cgroup_namespace *ns = current->nsproxy->cgroup_ns;
> > >> +	struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
> > >> +	struct cgroup *ns_cgroup;
> > >> +
> > >> +	mutex_lock(&cgroup_mutex);
> > >
> > > Hm, I can't grab the cgroup mutex here because I already have the
> > > namespace_sem. But that's required by cset_cgroup_from_root(). Can
> > > I just call that under rcu_read_lock() instead? (Not without
> > > changing the lockdep_assert_held()). Is there another way to get the
> > > info needed here?
> >
> > Do we need the current cgroup namespace information at all?
> >
> > Could we not get the relevant cgroup namespace from the mount of
> > cgroupfs?
>
> I don't think so. That was my first inclination. But at show_path()
> all we have is the vfsmount->mnt_root. Since all cgroup namespaces
> for a hierarchy share the same dentry tree and superblock, there's
> no way to tell where the mount's namespace root is supposed to be.
>
> Whether we did
>
> # enter new cgroup namespace rooted at cgroup /user.slice/user-1000.slice
> mount -t cgroup -o freezer freezer /mnt
>
> or
>
> mount --bind /sys/fs/cgroup/freezer/user.slice/user-1000.slice /mnt
>
> the mountinfo entry will be the same.
>
> > In general the better path is not to have the contents of files depend on
> > who is reading the file.
And actually, while as I said above this was my first inclination, I now
think that's wrong. /proc/$$/cgroup is virtualized per the reader. The
point of this patch is to make mountinfo virtualized analogously to
/proc/$$/cgroup, so that we can be certain how a particular cgroup dentry
relates to a task's actual cgroup. So the mountinfo dentry root path
should in fact depend on the reader.
Looking at it another way... The value we're talking about shows us
the path of the root dentry of a cgroup mount. If a task in cgns2
rooted at /a/b/c mounts a cgroupfs, it will see '/' as the root dentry.
If a task in cgns1 rooted at /a/b looks at that mountinfo, '/' would
be misleading. It really should be '/c'.
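To make that example concrete, here is a rough userspace sketch of the
mapping I have in mind. It is only an illustration, not the patch:
path_relative_to_ns() is a made-up helper name, the paths are plain
strings rather than kernfs nodes, and it assumes the mount's root sits at
or below the reader's namespace root (the case where the reader is rooted
deeper than the mount is simply rejected here).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): express the absolute cgroup
 * path `mnt_root` relative to the reader's cgroup namespace root
 * `ns_root`.  Returns 0 on success, -1 if mnt_root is not at or below
 * ns_root, or if the result does not fit in `out`.
 */
static int path_relative_to_ns(const char *ns_root, const char *mnt_root,
			       char *out, size_t outlen)
{
	size_t n = strlen(ns_root);
	const char *rest;

	/* Treat the global root "/" as an empty prefix. */
	if (n == 1 && ns_root[0] == '/')
		n = 0;

	if (strncmp(ns_root, mnt_root, n) != 0)
		return -1;

	/* The prefix must end on a path component boundary. */
	if (mnt_root[n] != '\0' && mnt_root[n] != '/')
		return -1;

	/* Nothing left after the prefix means the mount root is the ns root. */
	rest = mnt_root[n] ? mnt_root + n : "/";
	if (strlen(rest) + 1 > outlen)
		return -1;

	strcpy(out, rest);
	return 0;
}
```

With ns_root "/a/b" (cgns1) and mnt_root "/a/b/c" (the mount made from
cgns2), this fills in "/c". With ns_root "/a/b/c" it fills in "/", which
is what the mounter itself would see.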
If there were security implications those might override this. But there
is no security benefit to this. (The usual security argument is about
the opener vs the reader, not the mounter versus the reader, but in either
case I maintain there is no security benefit to virtualizing these paths.)