Re: [PATCH] namei: permit linking with CAP_FOWNER in userns
From: Seth Forshee
Date: Tue Oct 27 2015 - 10:34:26 EST
On Tue, Oct 20, 2015 at 04:09:19PM +0200, Dirk Steinmetz wrote:
> Attempting to hardlink to an unsafe file (e.g. a setuid binary) from
> within an unprivileged user namespace fails, even if CAP_FOWNER is held
> within the namespace. This may cause various failures, such as a gentoo
> installation within a lxc container failing to build and install specific
> This change permits hardlinking of files owned by mapped uids, if
> CAP_FOWNER is held for that namespace. Furthermore, it improves consistency
> by using the existing inode_owner_or_capable(), which is aware of
> namespaced capabilities as of 23adbe12ef7d3 ("fs,userns: Change
> inode_capable to capable_wrt_inode_uidgid").
> Signed-off-by: Dirk Steinmetz <public@xxxxxxxxxxxxxxxxxx>
Tested-by: Seth Forshee <seth.forshee@xxxxxxxxxxxxx>
This is hitting us in Ubuntu during some dpkg upgrades in containers.
When upgrading a file dpkg creates a hard link to the old file to back
it up before overwriting it. When packages upgrade suid files owned by a
non-root user the link isn't permitted, and the package upgrade fails.
This patch fixes our problem.
I did want to point what seems to be an inconsistency in how
capabilities in user namespaces are handled with respect to inodes. When
I started looking at this my initial thought was to replace
capable(CAP_FOWNER) with capable_wrt_inode_uidgid(inode, CAP_FOWNER). On
the face of it this should be equivalent to what's done here, but it
turns out that capable_wrt_inode_uidgid requires that the inode's uid
and gid are both mapped into the namespace whereas
inode_owner_or_capable only requires the uid be mapped. I'm not sure how
significant that is, but it seems a bit odd.
> This is the third time I'm sending the patch, as the first two attempts did
> not provoke a reply. Feel free to point out any issues you see with it --
> including formal requirements, as this is the first patch I'm submitting.
> I'd really appreciate your time.
> Maybe a bit of rationale behind it would be helpful as well: some linux
> distributions, especially gentoo in which I discovered the behaviour,
> rely on root being able to hardlink arbitrary files. In the case of gentoo,
> this happens when building and installing 'man': the built binary has the
> suid-flag set and is owned by a user 'man'. The installation script
> (running as root) then attempts to insert a hardlink towards that binary.
> Thanks to user namespaces, a regular user can use subuids to create a user
> namespace, and acquire root-like capabilities within said namespace. It is
> then possible to install and use arbitrary linux distributions within such
> namespaces. When installing gentoo in that manner, building and installing
> 'man' fails, as may_linkat checks the capabilities in the init namespace,
> where the installation process is owned by a regular user.
> In my opinion may_linkat should permit linking in this case, as the file to
> link to is owned by one of the regular user's mapped subids. Note that, in
> the scenario described above, it is already possible to create the hardlink
> through other means (the following listing is from an unprivileged user
> > # cat /proc/$$/status | grep CapEff
> > CapEff: 0000003cfdfeffff
> > # ls -l
> > total 0
> > -rwSr--r-- 1 nobody nobody 0 Oct 20 15:40 file
> > # ln file link
> > ln: failed to create hard link 'link' => 'file': Operation not permitted
> > # su nobody -s /bin/bash -c "ln file link"
> > # ls -l
> > total 0
> > -rwSr--r-- 2 nobody nobody 0 Oct 20 15:40 file
> > -rwSr--r-- 2 nobody nobody 0 Oct 20 15:40 link
> As you can see, the process has CAP_FOWNER in the namespace, but cannot
> hardlink the file owned by 'nobody'. It can, however, use su to switch to
> 'nobody' and then create the link. After applying this patch, linking
> works as expected.
> fs/namei.c | 7 ++-----
> 1 file changed, 2 insertions(+), 5 deletions(-)
> diff --git a/fs/namei.c b/fs/namei.c
> index 33e9495..0d3340b 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -955,26 +955,23 @@ static bool safe_hardlink_source(struct inode *inode)
> * - sysctl_protected_hardlinks enabled
> * - fsuid does not match inode
> * - hardlink source is unsafe (see safe_hardlink_source() above)
> - * - not CAP_FOWNER
> + * - not CAP_FOWNER in a namespace with the inode owner uid mapped
> * Returns 0 if successful, -ve on error.
> static int may_linkat(struct path *link)
> - const struct cred *cred;
> struct inode *inode;
> if (!sysctl_protected_hardlinks)
> return 0;
> - cred = current_cred();
> inode = link->dentry->d_inode;
> /* Source inode owner (or CAP_FOWNER) can hardlink all they like,
> * otherwise, it must be a safe source.
> - if (uid_eq(cred->fsuid, inode->i_uid) || safe_hardlink_source(inode) ||
> - capable(CAP_FOWNER))
> + if (inode_owner_or_capable(inode) || safe_hardlink_source(inode))
> return 0;
> audit_log_link_denied("linkat", link);
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/