Re: [RFC] Virtualization steps

From: Serge E. Hallyn
Date: Thu Mar 30 2006 - 11:40:40 EST


Quoting Herbert Poetzl (herbert@xxxxxxxxxxxx):
> On Thu, Mar 30, 2006 at 08:32:24AM -0600, Serge E. Hallyn wrote:
> > Quoting Chris Wright (chrisw@xxxxxxxxxxxx):
> > > * David Lang (dlang@xxxxxxxxxxxxxxxxxx) wrote:
> > > > what if the people administering the container are different from the
> > > > people administering the host?
> > >
> > > Yes, I alluded to that.
> > >
> > > > in that case the people working in the container want to be able to
> > > > implement and change their own policy, and the people working on the host
> > > > don't want to have to implement changes to their main policy config (wtih
> > > > all the auditing that would be involved with it) every time a container
> > > > wants to change it's internal policy.
> > >
> > > *nod*
> > >
> > > > I can definantly see where a container aware policy on the master would be
> > > > useful, but I can also see where the ability to nest seperate policies
> > > > would be useful.
> > >
> > > This is all fine. The question is whether this is a policy management
> > > issue or a kernel infrastructure issue. So far, it's not clear that this
> > > really necessitates kernel infrastructure changes to support container
> > > aware policies to be loaded by physical host admin/owner or the virtual
> > > host admin. The place where it breaks down is if each virtual host
> > > wants not only to control its own policy, but also its security model.
> >
> > What do you define as 'policy', and how is it different from the
> > security model?
> >
> > > Then we are left with stacking modules or heavier isolation (as in Xen).
> >
> > Hmm, talking about 'container' in this sense is confusing, because we're
> > not yet clear on what a container is.
> >
> > So I'm trying to get a handle on what we really want to do.
> >
> > Talking about namespaces is tricky. For instance if I do
> > clone(CLONE_NEWNS), the new process is in a new fs namespace, but the
> > fs objects are still the same, so if it loads an LSM, then perhaps at
> > most the new process should only control mount activities in its own
> > namespace.
> >
> > Frankly I thought, and am still not unconvinced, that containers owned
> > by someone other than the system owner would/should never want to load
> > their own LSMs, so that this wasn't a problem. Isolation, as Chris has
> > mentioned, would be taken care of by the very nature of namespaces.
> >
> > There are of course two alternatives... First, we might want to
> > allow the machine admin to insert per-container/per-namespace LSMs.
> > To support this case, we would need a way for the admin to tag a
> > container some way identifying it as being subject to a particular set
> > of security_ops.
> >
> > Second, we might want container admins to insert LSMs. In addition
> > to a straightforward way of tagging subjects/objects with their
> > container, we'd need to implement at least permissions for "may insert
> > global LSM", "may insert container LSM", and "may not insert any LSM."
> > This might be sufficient if we trust userspace to always create full
> > containers. Otherwise we might want to support meta-policy along the
> > lines of "may authorize ptrace and mount hooks only", or even "not
> > subject to the global inode_permission hook, and may create its own."
>
> sorry folks, I don't think that we _ever_ want container
> root to be able to load any kernel modues at any time
> without having CAP_SYS_ADMIN or so, in which case the
> modules can be global as well ... otherwise we end up
> as a bad Xen imitation with a lot of security issues,
> where it should be a security enhancement ...

I agree. As Chris points out, at most we should help LSM become
container-aware. But as the selinux example shows, even that should
not be necessary.

And that's for funky setups. For normal setups, the isolation provided
inherently by the namespaces should suffice.

thanks,
-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/