Re: [PATCH] vfs: relax linkat() AT_EMPTY_PATH - aka flink() - requirements
From: Charles Mirabile
Date: Thu Apr 11 2024 - 16:08:45 EST
On Thu, Apr 11, 2024 at 2:14 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, 11 Apr 2024 at 10:35, Charles Mirabile <cmirabil@xxxxxxxxxx> wrote:
> >
> > And a slightly dubious addition to bypass these checks for tmpfiles
> > across the board.
>
> Does this make sense?
>
> I 100% agree that one of the primary reasons why people want flink()
> is that "open tmpfile, finalize contents and permissions, then link
> the final result into the filesystem".
>
> But I would expect that the "same credentials as open" check is the
> one that really matters.
Certainly. I think that in almost all cases, the pattern of preparing
a file and then using linkat to give it its final name will occur
without changing creds and as such your patch will fix it. When
combined with making the CAP_DAC_READ_SEARCH override aware of
namespaces, I think that covers almost all of the remaining edges
cases (i.e. if the difference in creds was actually you becoming more
privileged and not less, then sure you can do it).
The only possibility that remains is a difference in creds where the
new creds are not privileged. This is the maybe scary situation that
has blocked flink in the past. All I am suggesting is that you can
decompose this niche situation still further into the case where the
file was opened "ordinarily" in some sense which is the case that is
really concerning (oops, when I forked and dropped privileges before
exec, I forgot to set cloexec on one of my fds that points to
something important, and even though I opened it with O_RDONLY, the
unprivileged code is able to make a new link to it in a directory
they control and re-open with O_RDWR because the mode bits allow say
o+w (extremely bizarre and honestly hypothetical in my opinion, but
ok)) and other special situations.
Those situations namely being O_PATH and O_TMPFILE where by
specifying these special flags during open you are indicating that
you intend to do something special with the file descriptor. I think
if either of these flags are present in the file flags, then we are
not in the concerning case, and I think it could be appropriate to
bypass the matching creds check.
>
> And __O_TMPFILE is just a special case that might not even be used -
> it's entirely possible to just do the same with a real file (ie
> non-O_TMPFILE) and link it in place and remove the original.
>
> Not to mention that ->tmpfile() isn't necessarily even available, so
> the whole concept of "use O_TMPFILE and then linkat" is actually
> broken. It *has* to be able to fall back to a regular file to work at
> all on NFS.
>
> So while I understand your motivation, I actually think it's actively
> wrong to special-case __O_TMPFILE, because it encourages a pattern
> that is bad.
The problem with this is that another process might be able to access
the file during via that name during the brief period before it is
unlinked. If I am not using NFS, I am always going to prefer using
O_TMPFILE. I would rather be able to do that without restriction even
if it isn't the most robust solution by your definition.
In my opinion I think it is more robust in the sense that it is truly
atomic and making a named file is the kludge that you have to do to
work around NFS limitations, but I agree that this is a tiny detail
that I certainly do not want to block this patch over because it
already solves the problem I was actually dealing with. Whether or
not it solves this hypothetical problem is less important.
>
> Linus
>