Re: Introspecting userns relationships to other namespaces?

From: Eric W. Biederman
Date: Wed Jul 06 2016 - 11:59:12 EST


"Serge E. Hallyn" <serge@xxxxxxxxxx> writes:

> On Wed, Jul 06, 2016 at 10:41:48AM +0200, Michael Kerrisk (man-pages) wrote:
>> [Rats! Doing now what I should have done to start with. Looping some
>> lists and CRIU and other possibly relevant people into this
>> conversation]
>>
>> Hi Eric,
>>
>> On 5 July 2016 at 23:47, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>> > "Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx> writes:
>> >
>> >> Hi Eric,
>> >>
>> >> I have a question. Is there any way currently to discover which
>> >> user namespace a particular nonuser namespace is governed by?
>> >> Maybe I am missing something, but there does not seem to be a
>> >> way to do this. Also, can one discover which userns is the
>> >> parent of a given userns? Again, I can't see a way to do this.
>> >>
>> >> The point here is introspecting so that a process might determine
>> >> what its capabilities are when operating on some resource governed
>> >> by a (nonuser) namespace.
>> >
>> > To the best of my knowledge there is not an interface to get that
>> > information. It would be good to have such an interface for no other
>> > reason than the CRIU folks are going to need it at some point. I am a
>> > bit surprised they have not complained yet.
>
> I don't think they need it. They do in fact have what they need. Assume
> you have tasks T1, T2, T1_1 and T2_1; T1 and T2 are in init_user_ns; T1
> spawned T1_1 in a new userns; T2 spawned T2_1 which setns()d to T1_1's ns.
> There's some {handwave} uid mapping, does not matter.
>
> At restart, it doesn't matter which task originally created the new userns.
> criu knows T1_1 and T2_1 are in the same userns; it creates the userns, sets
> up the mapping, and T1_1 and T2_1 setns() to it.

Given that the simple cases are so easy it probably doesn't matter in
that sense.

However we now have the case where user namespaces own pid namespaces,
and uts namespaces, and network namespaces, and ipc namespaces, and
filesystems. Throw in some mount propagation and use of setns and
things could get confusing. It is something that will need to be
figured out if CRIU is going to properly checkpoint containers
containing containers containing containers containing containers.

Did I mention I like recursion?

>> > That said in a normal use scenario I don't think that information is
>> > needed.
>> >
>> > Do you have a particular use case besides checkpoint/restart where this
>> > is useful? That might help in coming up with a good userspace interface
>> > for this information.
>>
>> So, I spend a moderate amount of time working with people to introduce
>> them to the namespaces infrastructure, and one topic that comes up now
>> and then is introspection/visualization tools. For example,
>> nowadays--thanks to the (bizarrely misnamed) NStgid and NSpid fields
>> in /proc/PID/status--it's possible to (and someone I was working with did)
>> write tools that introspect the PID namespace hierarchy to show all of
>> the processes and their PIDs in the various namespace instances. It's a
>> natural enough thing to want to do, when confronted with the
>> complexity of the namespaces.
>>
>> Someone else then asked me a question that led me to wonder about
>> generally introspecting on the parental relationships between user
>> namespaces and the association of other namespaces types with user
>> namespaces. One use would be visualization, in order to understand the
>> running system. Another would be to answer the question I already
>> mentioned: what capability does process X have to perform operations
>> on a resource governed by namespace Y?
>
> I agree they'll probably want it, but if we wait for a real need and
> use case we can do a better job of providing what's needed.

That too, which is why I mentioned CRIU. But yeah it will probably take
a little while to get there.

Eric
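
For reference, the kind of PID namespace introspection tool described
in the quoted text above can be sketched roughly as follows. This is
only a minimal illustration, assuming the NSpid line of
/proc/PID/status lists the process's ID in each PID namespace it
belongs to, outermost first. The helper names parse_ns_field() and
ns_pids() are made up for this example and are not part of any
existing tool.

```python
from pathlib import Path


def parse_ns_field(status_text, field="NSpid"):
    """Return the integers on the `field:` line of a /proc/PID/status dump.

    For NSpid the values run from the outermost PID namespace visible to
    the reader down to the process's own PID namespace.
    """
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == field:
            return [int(tok) for tok in value.split()]
    return []  # field absent (older kernel, or not a status dump)


def ns_pids(pid):
    """Read NSpid for a live process; returns [] if it cannot be read."""
    try:
        text = Path(f"/proc/{pid}/status").read_text()
    except OSError:
        return []
    return parse_ns_field(text)


if __name__ == "__main__":
    # Hypothetical status excerpt for a process one PID namespace deep.
    sample = "Name:\tsleep\nPid:\t12345\nNStgid:\t12345\t7\nNSpid:\t12345\t7\n"
    pids = parse_ns_field(sample)
    print(pids, "nesting depth:", len(pids) - 1)
```

Note that this only recovers the PID namespace nesting of a process.
It says nothing about which user namespace owns a given namespace,
which is the gap being discussed in this thread.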