Re: [PATCH] vfs: relax linkat() AT_EMPTY_PATH - aka flink() - requirements

From: Charles Mirabile
Date: Thu Apr 11 2024 - 14:20:17 EST


Here is an updated version of linus's patch that makes the check
namespace relative
---
fs/namei.c | 17 ++++++++++++-----
include/linux/namei.h | 1 +
2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 4e0de939fea1..2bcc10976ba7 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2419,6 +2419,14 @@ static const char *path_init(struct nameidata
*nd, unsigned flags)
if (!f.file)
return ERR_PTR(-EBADF);

+ if (flags & LOOKUP_DFD_MATCH_CREDS) {
+ if (f.file->f_cred != current_cred() &&
+ !ns_capable(f.file->f_cred->user_ns, CAP_DAC_READ_SEARCH)) {
+ fdput(f);
+ return ERR_PTR(-ENOENT);
+ }
+ }
+
dentry = f.file->f_path.dentry;

if (*s && unlikely(!d_can_lookup(dentry))) {
@@ -4640,14 +4648,13 @@ int do_linkat(int olddfd, struct filename
*old, int newdfd,
goto out_putnames;
}
/*
- * To use null names we require CAP_DAC_READ_SEARCH
+ * To use null names we require CAP_DAC_READ_SEARCH or
+ * that the open-time creds of the dfd matches current.
* This ensures that not everyone will be able to create
* handlink using the passed filedescriptor.
*/
- if (flags & AT_EMPTY_PATH && !capable(CAP_DAC_READ_SEARCH)) {
- error = -ENOENT;
- goto out_putnames;
- }
+ if (flags & AT_EMPTY_PATH)
+ how |= LOOKUP_DFD_MATCH_CREDS;

if (flags & AT_SYMLINK_FOLLOW)
how |= LOOKUP_FOLLOW;
diff --git a/include/linux/namei.h b/include/linux/namei.h
index 74e0cc14ebf8..678ffe4acf99 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -44,6 +44,7 @@ enum {LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT};
#define LOOKUP_BENEATH 0x080000 /* No escaping from starting point. */
#define LOOKUP_IN_ROOT 0x100000 /* Treat dirfd as fs root. */
#define LOOKUP_CACHED 0x200000 /* Only do cached lookup */
+#define LOOKUP_DFD_MATCH_CREDS 0x400000 /* Require that dfd creds
match current */
/* LOOKUP_* flags which do scope-related checks based on the dirfd. */
#define LOOKUP_IS_SCOPED (LOOKUP_BENEATH | LOOKUP_IN_ROOT)

--
2.44.0

On Thu, Apr 11, 2024 at 12:44 PM Charles Mirabile <cmirabil@xxxxxxxxxx> wrote:
>
> On Thu, Apr 11, 2024 at 12:15 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, 11 Apr 2024 at 02:05, Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > >
> > > I had a similar discussion a while back someone requested that we relax
> > > permissions so linkat can be used in containers.
> >
> > Hmm.
> >
> > Ok, that's different - it just wants root to be able to do it, but
> > "root" being just in the container itself.
> >
> > I don't think that's all that useful - I think one of the issues with
> > linkat(AT_EMPTY_PATH) is exactly that "it's only useful for root",
> > which means that it's effectively useless. Inside a container or out.
> >
> > Because very few loads run as root-only (and fewer still run with any
> > capability bits that aren't just "root or nothing").
> >
> > Before I did all this, I did a Debian code search for linkat with
> > AT_EMPTY_PATH, and it's almost non-existent. And I think it's exactly
> > because of this "when it's only useful for root, it's hardly useful at
> > all" issue.
> >
> > (Of course, my Debian code search may have been broken).
> >
> > So I suspect your special case is actually largely useless, and what
> > the container user actually wanted was what my patch does, but they
> > didn't think that was possible, so they asked to just extend the
> > "root" notion.
> >
> Yes, that is absolutely the case. When Christian poked holes in my
> initial submission, I started working on a v2 but haven't had a chance
> to send it because I wanted to make sure my arguments etc were
> airtight because I am well aware of just how long and storied the
> history of flink is. In the v2 that I didn't post yet, I extended the
> ability to link *any* file from only true root to also "root" within a
> container following the potentially suspect approach that christian
> suggested (I see where you are coming from with the unobvious approach
> to avoiding toctou though I obviously support this patch being
> merged), but I added a followup patch that checks for whether the file
> in question is an `__O_TMPFILE` file which is trivial once we are
> jumping through the hoops of looking up the struct file. If it is a
> tmpfile (i.e. linkcount = 0) I think that all the concerns raised
> about processes that inherit a fd being able to create links to the
> file inappropriately are moot. Here is an excerpt from the cover
> letter I was planning to send that explains my reasoning.
>
> - the ability to create an unnamed file, adjust its contents
> and attributes (i.e. uid, gid, mode, xattrs, etc) and then only give it a name
> once they are ready is exactly the ideal use-case for a hypothetical `flink`
> system call. The ability to use `AT_EMPTY_PATH` with `linkat` essentially turns
> it into `flink`, but these restrictions hamper it from actually being used, as
> most code is not privileged. By checking whether the file to be linked is a
> temporary (i.e. unnamed) file we can allow this useful case without allowing
> the more risky cases that could have security implications.
>
> - Although it might appear that the security posture is affected, it is not.
> Callers who open an extant file on disk in read only mode for sharing with
> potentially untrustworthy code can trust that this change will not affect them
> because it only applies to temporary files. Callers who open a temporary file
> need to do so with write access if they want to actually put any data in it,
> so they will already have to reopen the file (e.g. by linking it to a path
> and opening that path, or opening the magic symlink in proc) if they want to
> share it in read-only mode with untrustworthy code. As such, that new file
> description will no longer be marked as a temporary file and thus these changes
> do not apply. The final case I can think of is the unlikely possibility of
> intentionally sharing read write access to data stored in a temporary file with
> untrustworthy code, but where that code being able to make a link would still
> be deleterious. In such a bizarre case, the `O_EXCL` can and should be used to
> fully prevent the temporary file from ever having a name, and this change does
> not alter its efficacy.
>
> > I've added Charles to the Cc.
> >
> > But yes, with my patch, it would now be trivial to make that
> >
> > capable(CAP_DAC_READ_SEARCH)
> >
> > test also be
> >
> > ns_capable(f.file->f_cred->user_ns, CAP_DAC_READ_SEARCH)
> >
> > instead. I suspect not very many would care any more, but it does seem
> > conceptually sensible.
> >
> > As to your patch - I don't like your nd->root games in that patch at
> > all. That looks odd.
> >
> > Yes, it makes lookup ignore the dfd (so you avoid the TOCTOU issue),
> > but it also makes lookup ignore "/". Which happens to be ok with an
> > empty path, but still...
> >
> > So it feels to me like that patch of yours mis-uses something that is
> > just meant for vfs_path_lookup().
> >
> > It may happen to work, but it smells really odd to me.
> >
> > Linus
> >