Re: [PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc

From: Richard Guy Briggs
Date: Sun Aug 24 2014 - 16:28:42 EST


On 14/08/24, Andy Lutomirski wrote:
> On Thu, Aug 21, 2014 at 6:58 PM, Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> > On 14/08/21, Andy Lutomirski wrote:
> >> On Aug 20, 2014 8:12 PM, "Richard Guy Briggs" <rgb@xxxxxxxxxx> wrote:
> >> > Expose the namespace instance serial numbers in the proc filesystem at
> >> > /proc/<pid>/ns/<ns>_snum. The link text gives the serial number in hex.
> >>
> >> What's the use case?
> >>
> >> I understand the utility of giving unique numbers to the audit code,
> >> but I don't think this part is necessary for that, and I'd like to
> >> understand what else will use this before committing to a duplicative
> >> API like this.
> >
> > How does a container manager get those numbers? It could provoke a task
> > to cause an audit event that emits a NS_INFO message, or it could run a
> > task in that container to report its namespace serial numbers directly
> > from its /proc mount.
>
> Why does a container manager need them? Is there any reason that
> keeping them entirely contained within the audit system would be a
> problem?

The audit system is currently per-kernel. If a container is migrated
from one kernel to another, the first audit system can no longer
monitor or care about it. It is the container manager whose scope spans
both kernels, so it is the one able to keep monitoring the container.

This might be a good argument to augment the audit system as we
currently know it to be able to do this across kernels, but that isn't
currently the case.

> > The discussion in this thread touches on the use cases:
> > https://lkml.org/lkml/2014/4/22/662
> >
> >> Note that this API is thoroughly incompatible with CRIU. If we do
> >> this, someone will ask for a namespace number namespace, and that way
> >> lies madness.
> >
> > I had a very brief look at CRIU, but not enough to understand the issue.
> > Others have hinted at this problem.
> >
> > Do you have a suggestion of a different approach that would be
> > compatible with CRIU?
> >
> > I'd originally considered some sort of UUID that would be globally
> > unique, but that would be very hard to devise or guarantee, and besides,
> > namespaces aren't only used by containers and could be shared in other
> > ways. Tracking the usage and migration of namespaces should be the task
> > of an upper layer.
>
> CRIU wants to save the complete state of a namespace and then restore
> it. For that to work, any information exposed to things in the
> namespace *cannot* be globally unique or unique per boot, since CRIU
> needs to arrange for that information to match whatever it was when
> CRIU saved it.

So are you agreeing with Eric Biederman's idea that a namespace's proc
inode number should initially be assigned serially, but remain settable
when a namespace is restored from another host? What if that inode
number collides with an existing one?

Does CRIU have no latitude at all to be able to track a new namespace
ID?

> Also, I think that code running in a namespace has no business even
> knowing a unique identity of that namespace from the perspective of
> the host.

Too late. There are already the namespace proc inode numbers. Those
numbers are almost completely meaningless to the code running inside the
container/namespace.
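
For illustration, here is a minimal sketch (Python, not part of this
patch set) of how any task can already read those inode numbers from
its own /proc mount. It assumes the existing link text format of
/proc/<pid>/ns/<ns>, e.g. "net:[4026531956]". The helper names
parse_ns_link() and list_ns_inodes() are made up for this example.

```python
import os
import re

# Existing link text of /proc/<pid>/ns/<ns> looks like "net:[4026531956]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inum>\d+)\]$")


def parse_ns_link(text):
    """Split link text such as 'net:[4026531956]' into (kind, inode)."""
    match = _NS_LINK.match(text)
    if match is None:
        raise ValueError("unexpected namespace link text: %r" % text)
    return match.group("kind"), int(match.group("inum"))


def list_ns_inodes(pid="self"):
    """Map each entry under /proc/<pid>/ns to its namespace inode number.

    Entries that cannot be read or parsed are skipped.
    """
    ns_dir = "/proc/%s/ns" % pid
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            link_text = os.readlink(os.path.join(ns_dir, name))
            _kind, inum = parse_ns_link(link_text)
        except (OSError, ValueError):
            continue
        result[name] = inum
    return result


if __name__ == "__main__":
    # Print the namespace inode numbers visible to the calling task.
    for name, inum in list_ns_inodes().items():
        print("%s: %d" % (name, inum))
```

Run inside a container, this prints the same inode numbers the host
would see for those namespaces, which is the point: the identifier is
already visible from inside.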

> Here's a specific use case for *not* exposing this: Tor. Ideally, Tor
> clients would run in a namespace that does not know about any global
> identity. That means no IP addresses, but it also means no global
> namespace serial numbers.

Well, it already has an IP address (which might be masqueraded by the
host or another upstream router) and a namespace inode number.

I'm not aware of support for anonymous namespaces, let alone anonymous
containers yet.

> --Andy

- RGB

--
Richard Guy Briggs <rbriggs@xxxxxxxxxx>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545