Re: [lxc-devel] [RFC PATCH 11/11] loop: Allow priveleged operations for root in the namespace which owns a device

From: Serge Hallyn
Date: Tue May 27 2014 - 09:17:41 EST


Quoting Michael H. Warfield (mhw@xxxxxxxxxxxx):
> On Tue, 2014-05-27 at 03:36 +0200, Serge E. Hallyn wrote:
> > Quoting Michael H. Warfield (mhw@xxxxxxxxxxxx):
> > > On Mon, 2014-05-26 at 11:16 +0200, Seth Forshee wrote:
> > > > On Fri, May 23, 2014 at 08:48:25AM +0300, Marian Marinov wrote:
> > > > > -----BEGIN PGP SIGNED MESSAGE-----
> > > > > Hash: SHA1
> > > > >
> > > > > One question about this patch.
> > > > >
> > > > > Why don't you use the devices cgroup check if the root user in that namespace is allowed to use this device?
> > > > >
> > > > > This way you can be sure that the root in that namespace can not access devices to which the host system did not give
> > > > > him access.
> > >
> > > > That might be possible, but I don't want to require something on the
> > > > host to whitelist the device for the container. Then loop would need to
> > > > automatically add the device to devices.allow, which doesn't seem
> > > > desirable to me. But I'm not entirely opposed to the idea if others
> > > > think this is a better way to go.
> > >
> > > I don't see any safe way to avoid it. The host has to be in control of
> > > what devices can and can not be accessed by the container.
>
> > Disagree. loop%d is meaningless until it is attached to a file. So
> > whether a container can use loop2 vs loop9 is meaningless. The point
> > of Seth's loopfs as I understood it is that the container simply gets a
> > unique (not visible to host or any other containers) set of loop devices
> > which it can attach to files which it owns. So long as the host can't
> > see the container's loop devices (i.e. so it can't unwittingly mount one when
> > looking for a particular UUID for /var), it won't get fooled by them.
>
> > So in this case *if* we can do it, a purely namespaced approach - meaning
> > that we restrict visibility of a particular loopdev to one container - is
> > perfect.
>
> And in that "*if*" is a cloud that says "then a miracle occurs" and that
> miracle needs a lot more detail.

Naturally. Which is why, as Seth says, we'll need concrete code to discuss.
But the concept that a well-implemented namespace which prevents addressing
a given resource in the first place would suffice is, I think, a
well-accepted premise of security in Linux. And in this case it is more
appropriate than trying to finagle it into the devices cgroup. Note that
Marian said "to check if root user in that namespace is allowed to use
this device." This first off does not address the concern of root on the
host being tricked by the contents of loop0 which happens to be legitimately
used by container N. In contrast, making it so that loop0 is only
addressable by container N, and not by the host, does.
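
For illustration only (this is not from the thread or from Seth's patches):
a minimal Python toy model of the distinction drawn above. It assumes a
made-up LoopTable class, a made-up UUID, and a made-up backing file path; none
of these are kernel interfaces. In the cgroup-style model there is one global
table plus an allow-list, so the host still sees the container's loop0 when it
scans for a UUID. In the namespace-style model each owner has a private table,
so the host cannot address that device at all.

```python
# Toy model only. Every name here is hypothetical; nothing below is a
# kernel interface or part of the loopfs patches.

FAKE_VAR_UUID = "uuid-of-host-var"  # made-up UUID the host searches for


class LoopTable:
    """Maps loop minor numbers to (backing file, filesystem UUID)."""

    def __init__(self):
        self._devs = {}

    def attach(self, minor, backing_file, fs_uuid):
        self._devs[minor] = (backing_file, fs_uuid)

    def find_by_uuid(self, fs_uuid):
        # Return the minors in this table whose filesystem has that UUID.
        return sorted(m for m, (_, u) in self._devs.items() if u == fs_uuid)


def cgroup_model():
    """One global table plus a per-container allow-list."""
    table = LoopTable()
    allow = {"container-N": {0}}  # container N may use loop0
    if 0 in allow["container-N"]:
        table.attach(0, "/container-n/fake-var.img", FAKE_VAR_UUID)
    # Host root scans the same global table, so it finds the container's loop0.
    return table.find_by_uuid(FAKE_VAR_UUID)


def namespace_model():
    """One private table per owner; the host cannot address the container's loop0."""
    host, container = LoopTable(), LoopTable()
    container.attach(0, "/container-n/fake-var.img", FAKE_VAR_UUID)
    return host.find_by_uuid(FAKE_VAR_UUID), container.find_by_uuid(FAKE_VAR_UUID)
```

In this sketch the allow-list controls what the container may use, but only
the per-owner table keeps the host's UUID scan from ever reaching the
container's device.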

Anyway, I was reading the above as "why don't we *base* the containerized loop
on devices cgroups?" I object to that. Well, at least until we rule out
more elegant solutions. Of course I don't object to defense in depth.

-serge