Re: [PATCH] namei: permit linking with CAP_FOWNER in userns

From: Serge Hallyn
Date: Wed Oct 28 2015 - 13:33:18 EST


Quoting Dirk Steinmetz (public@xxxxxxxxxxxxxxxxxx):
> On Tue, 27 Oct 2015 20:28:02 +0000, Serge Hallyn wrote:
> > Quoting Dirk Steinmetz (public@xxxxxxxxxxxxxxxxxx):
> > > On Tue, 27 Oct 2015 09:33:44 -0500, Seth Forshee wrote:
> > > > I did want to point what seems to be an inconsistency in how
> > > > capabilities in user namespaces are handled with respect to inodes. When
> > > > I started looking at this my initial thought was to replace
> > > > capable(CAP_FOWNER) with capable_wrt_inode_uidgid(inode, CAP_FOWNER). On
> > > > the face of it this should be equivalent to what's done here, but it
> > > > turns out that capable_wrt_inode_uidgid requires that the inode's uid
> > > > and gid are both mapped into the namespace whereas
> > > > inode_owner_or_capable only requires the uid be mapped. I'm not sure how
> > > > significant that is, but it seems a bit odd.
> > >
> > > I agree that this seems odd. I've chosen inode_owner_or_capable over
> > > capable_wrt_inode_uidgid(inode, CAP_FOWNER) as it seemed consistent:
> > > a privileged user (with CAP_SETUID) can impersonate the owner UID and thus
> > > bypass the check completely; this also matches the documented behavior of
> > > CAP_FOWNER: "Bypass permission checks on operations that normally require
> > > the filesystem UID of the process to match the UID of the file".
> > >
> > > However, thinking about it I get the feeling that checking the gid seems
> > > reasonable as well. This is, however, independently of user namespaces.
> > > Consider the following scenario in any namespace, including the init one:
> > > - A file has the setgid and user/group executable bits set, and is owned
> > > by user:group.
> > > - The user 'user' is not in the group 'group', and does not have any
> > > capabilities.
> > > - The user 'user' hardlinks the file. The permission check will succeed,
> > > as the user is the owner of the file.
> > > - The file is replaced with a newer version (for example fixing a security
> > > issue)
> > > - Now user can still use the hardlink-pinned version to execute the file
> > > as 'user:group' (and for example exploit the security issue).
> > > I would have expected the user to not be able to hardlink, as he lacks
> > > CAP_FSETID, and thus is not allowed to chmod, change or move the file
> > > without loosing the setgid bit. So it is impossible for him to make a non-
> > > hardlink copy with the setgid bit set -- why should he be able to make a
> > > hardlinked one?
> >
> > Yeah, this sounds sensible. It allows a user without access to 'disk',
> > for instance, to become that group.
> >
> > > It seems to me as if may_linkat would additionally require a check
> > > verifying that either
> > > - not both setgid and group executable bit set
> > > - fsgid == owner gid
> > > - capable_wrt_inode_uidgid(CAP_FSETID) -- or CAP_FOWNER, depending on
> > > whether to adapt chmod's behavior or keeping everything hardlink-
> > > related in CAP_FOWNER; I don't feel qualified enough to pick ;)
> >
> > In particular just changing it is not ok since people who are using file
> > capabilities to grant what they currently need would be stuck with a
> > mysterious new failure.
>
> Is there any use case (besides exploiting hardlinks with malicious intent)
> that would be broken when changing this? There are some (imho) rather
> unlikely conditions to be met in order to observe changed behavior:

The simplest example would be if I wanted to run a very quick program to
just add the symbolic link. Let's say the link /usr/sbin/uuidd were owned
by root:disk and setuid and setgid. The proposed change would force me
to bind in both the root user and disk group, whereas without it I can
just bind in only the root user.

We've already dealt with such regressions and iirc agreed that they were
worthwhile.

> - a user owns an executable setgid-file belonging to a group he is not in
> - the user does not have CAP_FSETID (or CAP_FOWNER, depending on which one
> is chosen to be required)
> - the user is for some legitimate reason supposed to hardlink the file
> If these conditions are not met in practice, the change would not break
> anything. In that case, it would be imho better to not provide
> backward-compatibility to reduce complexity in these checks. Else, I'd
> propose adding a new possible value '2' for
> /proc/sys/fs/protected_hardlinks, while keeping '1' for the current
> behavior.
>
> I can provide an initial draft for either of the options, but would like
> recommendations to which of the two ways to take (or is there a third
> one?), as well as comments on the new condition itself: may_linkat would
> block hardlinks when all of the following conditions are met:
> - sysctrl_protected_hardlinks is enabled or 2 (depending on way)
> - inode uid != fsuid and no CAP_FOWNER (for userns: with mapping on uid),
> while the hardlink source is not a regular file, is a setuid-executable
> or is not accessible for reading and writing
> - inode gid not fsgid or in supplemental gids and no CAP_FSETID (for
> userns: with mapping on gid -- not sure whether the uid is relevant?),
> while the hardlink source is a setgid-executable (with group executable
> bit set)
>
> If anyone else wants to fix the issue, thats fine with me as well.
>
> Dirk
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/