Re: [PATCH 0/4] pidfs: implement file handle support

From: Erin Shepherd
Date: Tue Nov 12 2024 - 18:03:42 EST



On 12/11/2024 14:10, Christian Brauner wrote:
> Sorry for the delayed reply (I'm recovering from a lengthy illness.).
No worries on my part, and I hope you're feeling better!
> I like the idea in general. I think this is really useful. A few of my
> thoughts but I need input from Amir and Jeff:
>
> * In the last patch of the series you already implement decoding of
> pidfd file handles by adding a .fh_to_dentry export_operations method.
>
> There are a few things to consider because of how open_by_handle_at()
> works.
>
> - open_by_handle_at() needs to be restricted so it only creates pidfds
> from pidfs file handles that resolve to a struct pid that is
> reachable in the caller's pid namespace. In other words, it should
> mirror pidfd_open().
>
> Put another way, open_by_handle_at() must not be usable to open
> arbitrary pids to prevent a container from constructing a pidfd file
> handle for a process that lives outside it's pid namespace
> hierarchy.
>
> With this restriction in place open_by_handle_at() can be available
> to let unprivileged processes open pidfd file handles.
>
> Related to that, I don't think we need to make open_by_handle_at()
> open arbitrary pidfd file handles via CAP_DAC_READ_SEARCH. Simply
> because any process in the initial pid namespace can open any other
> process via pidfd_open() anyway because pid namespaces are
> hierarchical.
>
> IOW, CAP_DAC_READ_SEARCH must not override the restriction that the
> provided pidfs file handle must be reachable from the caller's pid
> namespace.

The pid_vnr(pid) == 0 check catches this case -- we return an error to the
caller if there isn't a pid mapping in the caller's namespace

Perhaps I should have called this out explicitly.

> - open_by_handle_at() uses may_decode_fh() to determine whether it's
> possible to decode a file handle as an unprivileged user. The
> current checks don't make sense for pidfs. Conceptually, I think
> there don't need to place any restrictions based on global
> CAP_DAC_READ_SEARCH, owning user namespace of the superblock or
> mount on pidfs file handles.
>
> The only restriction that matters is that the requested pidfs file
> handle is reachable from the caller's pid namespace.

I wonder if this could be handled through an addition to export_operations'
flags member?

> - A pidfd always has exactly a single inode and a single dentry.
> There's no aliases.
>
> - Generally, in my naive opinion, I think that decoding pidfs file
> handles should be a lot simpler than decoding regular path based
> file handles. Because there should be no need to verify any
> ancestors, or reconnect paths. Pidfs also doesn't have directory
> inodes, only regular inodes. In other words, any dentry is
> acceptable.
>
> Essentially, the only thing we need is for exportfs_decode_fh_raw()
> to verify that the provided pidfs file handle is resolvable in the
> caller's pid namespace. If so we're done. The challenge is how to
> nicely plumb this into the code without it sticking out like a sore
> thumb.

Theoretically you should be able to use PIDFD_SELF as well (assuming that
makes its way into mainline this release :-)) but I am a bit concerned about
potentially polluting the open_by_handle_at logic with pidfd specificities.

> - Pidfs should not be exportable via NFS. It doesn't make sense.

Hmm, I guess I might have made that possible, though I'm certainly not
familiar enough with the internals of nfsd to be able to test if I've done
so.

I guess probably this case calls for another export_ops flag? Not like we're
short on them

Thanks,
    - Erin