Re: [PATCH RFC] user-namespaced file capabilities - now with more magic

From: Serge E. Hallyn
Date: Fri May 20 2016 - 19:32:27 EST


Quoting Mimi Zohar (zohar@xxxxxxxxxxxxxxxxxx):
> On Fri, 2016-05-20 at 14:59 -0500, Serge E. Hallyn wrote:
> > Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> > > "Serge E. Hallyn" <serge@xxxxxxxxxx> writes:
> > >
> > > > Quoting Eric W. Biederman (ebiederm@xxxxxxxxxxxx):
> > > >> Mimi Zohar <zohar@xxxxxxxxxxxxxxxxxx> writes:
> > > >>
> > > >> > On Thu, 2016-05-19 at 22:40 -0500, Serge E. Hallyn wrote:
> > > >> >> Quoting Mimi Zohar (zohar@xxxxxxxxxxxxxxxxxx):
> > > >> >> > On Wed, 2016-05-18 at 16:57 -0500, Serge E. Hallyn wrote:
> > > >> >
> > > >> >> > > diff --git a/fs/xattr.c b/fs/xattr.c
> > > >> >> > > index 4861322..5c0e7ae 100644
> > > >> >> > > --- a/fs/xattr.c
> > > >> >> > > +++ b/fs/xattr.c
> > > >> >> > > @@ -94,11 +94,26 @@ int __vfs_setxattr_noperm(struct dentry *dentry, const char *name,
> > > >> >> > > {
> > > >> >> > > struct inode *inode = dentry->d_inode;
> > > >> >> > > int error = -EOPNOTSUPP;
> > > >> >> > > + void *wvalue = NULL;
> > > >> >> > > + size_t wsize = 0;
> > > >> >> > > int issec = !strncmp(name, XATTR_SECURITY_PREFIX,
> > > >> >> > > XATTR_SECURITY_PREFIX_LEN);
> > > >> >> > >
> > > >> >> > > - if (issec)
> > > >> >> > > + if (issec) {
> > > >> >> > > inode->i_flags &= ~S_NOSEC;
> > > >> >> > > + /* if root in a non-init user_ns tries to set
> > > >> >> > > + * security.capability, write a security.nscapability
> > > >> >> > > + * in its place */
> > > >> >> > > + if (!strcmp(name, "security.capability") &&
> > > >> >> > > + current_user_ns() != &init_user_ns) {
> > > >> >> > > + cap_setxattr_make_nscap(dentry, value, size, &wvalue, &wsize);
> > > >> >> > > + if (!wvalue)
> > > >> >> > > + return -EPERM;
> > > >> >> > > + value = wvalue;
> > > >> >> > > + size = wsize;
> > > >> >> > > + name = "security.nscapability";
> > > >> >> > > + }
> > > >> >> >
> > > >> >> > The call to capable_wrt_inode_uidgid() is hidden behind
> > > >> >> > cap_setxattr_make_nscap(). Does it make sense to call it here instead,
> > > >> >> > before the security.capability test? This would lay the foundation for
> > > >> >> > doing something similar for IMA.
> > > >> >>
> > > >> >> Might make sense to move that. Though looking at it with fresh eyes I wonder
> > > >> >> whether adding less code here at __vfs_setxattr_noperm(), i.e.
> > > >> >>
> > > >> >> if (!cap_setxattr_makenscap(dentry, &value, &size, &name))
> > > >> >> return -EPERM;
> > > >> >>
> > > >> >> would be cleaner.
> > > >> >
> > > >> > Yes, it would be cleaner, but I'm suggesting you do all the hard work
> > > >> > making it generic. Then the rest of us can follow your lead. Its more
> > > >> > likely that you'll get it right. At a high level, it might look like:
> > > >> >
> > > >> > /* Permit root in a non-init user_ns to modify the security
> > > >> > * namespace xattr equivalents (eg. nscapability, ns_ima, etc).
> > > >> > */
> > > >> > if ((current_user_ns() != &init_user_ns) &&
> > > >> > capable_wrt_inode_uidgid(inode, CAP_SETFCAP)) {
> > > >> >
> > > >> > if security..capability
> > > >> > call capability /* set nscapability? */
> > > >> >
> > > >> > else if security.ima
> > > >> > call ima /* set ns_ima? */
> > > >> > }
> > > >>
> > > >> Hmm. I am confused about this part of the strategy.
> > > >>
> > > >> I don't understand the capability vs nscapability distinction. It seems
> > > >> to add complexity without benefit.
> > > >
> > > > ... Well, yes, we could simply make a new version of security.capability
> > > > xattr, and make rootid == 0 mean it was written by the init_user_ns. Is
> > > > that what you mean?
> > >
> > > Yes.
> > >
> > > That would seem to simplify the logic to ensure the policy we enforce is
> > > consistent with what is on disk.
> >
> > I'll give that a shot. I think the reason I did it this way was that I'm
> > still kind of stuck in the not-magic way of thinking about it. But yeah
> > with the kernel magically writing inthe kuid there's probably no reason not
> > to.
>
> Totally confused. Will this method allow multiple instances of the
> xattr on disk?

No, but we don't actually want that anyway. The current behavior for
security.capability is that it works in all user namespaces. So we
want to continue the behavior that if root in the init_user_ns sets a
capability, that works in all namespaces. Allowing other namespaces
to set the capability would only be confusing.

So in the patchset I had, security.capability can only be set by
init_user_ns but works in all namespaces. security.nscapability
cannot be set if secrity.capability is set. And security.nscapability
works in all child namespaces of the root uid which set the cap.

-serge