Re: [lxc-devel] [RFC PATCH 00/11] Add support for devtmpfs in user namespaces
From: Greg Kroah-Hartman
Date: Fri May 16 2014 - 14:54:33 EST
On Fri, May 16, 2014 at 09:06:07AM -0500, Seth Forshee wrote:
> On Thu, May 15, 2014 at 09:35:32PM -0700, Greg Kroah-Hartman wrote:
> > On Fri, May 16, 2014 at 01:49:59AM +0000, Serge Hallyn wrote:
> > > > I think having to pick and choose what device nodes you want in a
> > > > container is a good thing. Becides, you would have to do the same thing
> > > > in the kernel anyway, what's wrong with userspace making the decision
> > > > here, especially as it knows exactly what it wants to do much more so
> > > > than the kernel ever can.
> > >
> > > For 'real' devices that sounds sensible. The thing about loop devices
> > > is that we simply want to allow a container to say "give me a loop
> > > device to use" and have it receive a unique loop device (or 3), without
> > > having to pre-assign them. I think that would be cleaner to do using
> > > a pseudofs and loop-control device, rather than having to have a
> > > daemon in userspace on the host farming those out in response to
> > > some, I don't know, dbus request?
> >
> > I agree that loop devices would be nice to have in a container, and that
> > the existing loop interface doesn't really lend itself to that. So
> > create a new type of thing that acts like a loop device in a container.
> > But don't try to mess with the whole driver core just for a single type
> > of device.
>
> No matter what I don't think we get out of this without driver core
> changes, whether this was done in loop or by creating something new.
> Not unless the whole thing is punted to userspace, anyway.
>
> The first problem is that many block device ioctls check for
> CAP_SYS_ADMIN. Most of these might not ever be used on loop devices, I'm
> not really sure. But loop does at minimum support partitions, and to get
> that functionality in an unprivileged container at least the block layer
> needs to know the namespace which has privileges for that device.
That's fine, you should have those permissions in a container if you
want to do something like that on a loop device, right?
> The second is that all block devices automatically appear in devtmpfs.
> The scenario I'm concerned about is that the host could unknowingly use
> a loop device exposed to a container, then the container could see data
> from the host.
I don't think that's a real issue, the host should know not to do that.
> So we either need a flag to tell the driver core not to create a node
> in devtmpfs, or we need a privileged manager in userspace to remove
> them (which kind of defeats the purpose). And it gets more complicated
> when partition block devs are mixed in, because they can be created
> without involvement from the driver - they would need to inherit the
> "no devtmpfs node" property from their parent, and if the driver uses
> a psuedo fs to create device nodes for userspace then it needs to be
> informed about the partitions too so it can create those nodes.
I don't think that will be needed. Root in a host can do whatever it
wants in the containers, so mixing up block devices is the least of the
issues involved :)
> So maybe we could get by without the privileged ioctls, as long as it
> was understood that unprivileged containers can't do partitioning. But I
> do think the devtmpfs problem would need to be addressed.
I don't think unpriviliged containers should be able to do partitioning.
An unpriviliged user can't do that, so why should a container be any
different?
thanks,
greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/