Re: [PATCH v2 1/3] namei: implement O_BENEATH-style AT_* flags

From: Aleksa Sarai
Date: Wed Oct 10 2018 - 03:29:03 EST

On 2018-10-10, Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
> On 2018-10-09, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > On Mon, Oct 8, 2018 at 11:53 PM Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
> > > * AT_NO_PROCLINK: Disallows ->get_link "symlink" jumping. This is a very
> > > specific restriction, and it exists because /proc/$pid/fd/...
> > > "symlinks" allow for access outside nd->root and pose risk to
> > > container runtimes that don't want to be tricked into accessing a host
> > > path (but do want to allow no-funny-business symlink resolution).
> >
> > Can you elaborate on the use case?
> >
> [...]
> I think that AT_BENEATH allowing only proclinks that result in you
> being under the root is something we might want in the future, but I
> think there are some cases where you want to be _very_ sure you don't
> follow a proclink (now or in the future).
> [...]

Sorry, just to clarify this point a bit more.

At the moment, "proclinks" are entirely disabled with AT_BENEATH. This
is a (hopefully) temporary measure until it's decided _how_ they should
be allowed. Personally, I think we should allow them if they meet the
same requirement as ".." escapes (that __d_path can resolve them).

But then the question arises -- what if we're looking at a never-mounted
pseudo-filesystem dentry (see the ->d_dname code in d_path)? If we don't
allow those, we'd probably block quite a few cases where you'd want
access to work (nsfs proclinks come immediately to mind).

*But* if we allow them, there's no real way to tell whether the container
process has tricked us into opening something we shouldn't (like an open
file descriptor to a memfd or pipe belonging to some host service). Maybe
we should still allow them because that scenario is very unlikely (and
allowing them would let you open nsfs links with AT_BENEATH), but I'm
not sure.
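
For anyone unfamiliar with the never-mounted case, here is a minimal
userspace sketch of what such proclinks look like. It is an illustration
only and not part of the patch. It assumes a Linux system with /proc
mounted, and the helper name proclink_target is made up for this example.
A pipe fd's link text is generated by ->d_dname rather than by walking a
path from some root, while a directory fd resolves to an ordinary path:

```python
import os


def proclink_target(fd: int) -> str:
    """Return what readlink() reports for /proc/self/fd/<fd>.

    Hypothetical helper for illustration; it only reads the link text
    and does not follow the proclink.
    """
    return os.readlink(f"/proc/self/fd/{fd}")


# A pipe lives on a never-mounted pseudo-filesystem, so its "path" is
# synthesized by ->d_dname and looks like "pipe:[<inode>]".
read_end, write_end = os.pipe()
pipe_target = proclink_target(read_end)

# A directory fd, by contrast, has a real path under the process root.
root_fd = os.open("/", os.O_RDONLY | os.O_DIRECTORY)
root_target = proclink_target(root_fd)

print("pipe fd ->", pipe_target)
print("root fd ->", root_target)

os.close(read_end)
os.close(write_end)
os.close(root_fd)
```

The pipe's target is not a path under any root, so there is nothing for a
"beneath nd->root" check to anchor on. That is the ambiguity described
above.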

