Re: [RFC PATCH] cgroup namespaces: add a 'nsroot=' mountinfo field

From: Aditya Kali
Date: Wed Apr 13 2016 - 19:31:33 EST


On Wed, Apr 13, 2016 at 12:01 PM, Serge E. Hallyn <serge@xxxxxxxxxx> wrote:
> Quoting Tejun Heo (tj@xxxxxxxxxx):
>> Hello, Serge.
>>
>> On Wed, Apr 13, 2016 at 01:46:39PM -0500, Serge E. Hallyn wrote:
>> > It's not a leak of any information we're trying to hide. I realize
>> > something like 8 years have passed, but I still basically go by the
>> > ksummit guidance that containers are ok but the kernel's first priority
>> > is to facilitate containers but not trick containers into thinking
>> > they're not containerized. So long as the container is properly set
>> > up, I don't think there's anything the workload could do with the
>> > nsroot= info other than *know* that it is in a ns cgroup.
>> >
>> > If we did change that guidance, there's a slew of proc info that we
>> > could better virtualize :)
>>
>> I see. I'm just wondering because the information here seems a bit
>> gratuituous. Isn't the only thing necessary telling whether the root
>> is bind mounted or namescoped? Wouldn't simple "nsroot" work for that
>> purpose?
>
> I don't think so - we could be in a cgroup namespace but still have
> access only to bind-mounted cgroups. So we need to compare the
> superblock dentry root field to the nsroot= value.

Umm, I don't think this is such a good idea. The main purpose of
cgroup namespace was to prevent this exposure of system cgroup
hierarchy that used to happen because of /proc/self/cgroup. Wouldn't
showing that information in /proc/self/mountinfo defeat the purpose?

> One practical problem I've found with cgroup namespaces is that there
> is no way to disambiguate between a cgroupfs mount which was done in
> a cgroup namespace, and a bind mount of a cgroupfs directory.

Thats actually by design, no? Namespaced apps should not know/care if
they are running inside namespace. If they can find it out today, its
just because of certain side-effects. I fear adding explicit "nsroot"
or something in /proc/self/mountinfo now becomes an API making it hard
to virtualize user-apps again.

--
Aditya