Re: [RFC 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

From: Mathieu Desnoyers
Date: Thu Nov 30 2023 - 14:00:09 EST


On 2023-11-30 13:54, Tycho Andersen wrote:
> On Thu, Nov 30, 2023 at 07:37:02PM +0100, Florian Weimer wrote:
>> * Tycho Andersen:
>>
>>> From: Tycho Andersen <tandersen@xxxxxxxxxxx>
>>>
>>> We are using the pidfd family of syscalls with the seccomp userspace
>>> notifier. When some thread triggers a seccomp notification, we want to do
>>> some things to its context (munge fd tables via pidfd_getfd(), maybe write
>>> to its memory, etc.). However, threads created with ~CLONE_FILES or
>>> ~CLONE_VM mean that we can't use the pidfd family of syscalls for this
>>> purpose, since their fd table or mm are distinct from the thread group
>>> leader's. In this patch, we relax this restriction for pidfd_open().
>>
>> Does this mean that pidfd_getfd cannot currently be used to get
>> descriptors for a TID if that TID doesn't happen to share its descriptor
>> set with the thread group leader?
>
> Correct, that's what I'm trying to solve.
>
>> I'd like to offer a userspace API which allows safe stashing of
>> unreachable file descriptors on a service thread.
>
> By "safe" here do you mean not accessible via pidfd_getfd()?

For the LTTng-UST use-case, we need to be able to create and
use a file descriptor from an agent thread injected within the target
process in a way that is safe against patterns where the application
blindly closes all file descriptors (for-loop doing close(2),
closefrom(2) or closeall(2)).

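The main issue here is that even though we could handle errors
(-1, errno=EBADF) in the sendmsg/recvmsg calls, once the application
has closed our descriptor, the same descriptor number can be re-used
for one of its own files. Our sendmsg/recvmsg calls then succeed on
the wrong file, which can lead to data corruption, and that is
certainly an unwanted consequence.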
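
To make the hazard concrete, here is a minimal sketch, not LTTng-UST
code. It assumes a single-threaded process and uses pipes in place of
the sockets an agent would really use. The function name and the
"trace" payload are made up for illustration. It relies only on the
kernel handing out the lowest free descriptor numbers:

```c
#include <assert.h>	/* only needed for the usage check noted below */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns 1 if a write through the agent's stale
 * descriptor number landed in a pipe owned by the application,
 * 0 if it did not, -1 on setup error.
 */
int stale_fd_write_is_misdirected(void)
{
	static const char msg[] = "trace";
	char buf[sizeof(msg)] = { 0 };
	int agent_pipe[2], app_pipe[2];
	int agent_fd, hit = 0;

	/* The agent creates its channel and remembers the fd number. */
	if (pipe(agent_pipe) < 0)
		return -1;
	agent_fd = agent_pipe[1];

	/* The application blindly closes every descriptor it can see. */
	close(agent_pipe[0]);
	close(agent_pipe[1]);

	/*
	 * The application opens its own pipe. The lowest free numbers
	 * are handed out, so the agent's old numbers are re-used.
	 */
	if (pipe(app_pipe) < 0)
		return -1;
	/* Never block in read() below if nothing was written. */
	fcntl(app_pipe[0], F_SETFL, O_NONBLOCK);

	/*
	 * The agent writes through its stale number. There is no EBADF:
	 * the write succeeds and the data ends up in the application's
	 * pipe.
	 */
	if (write(agent_fd, msg, sizeof(msg)) == (ssize_t)sizeof(msg) &&
	    read(app_pipe[0], buf, sizeof(buf)) == (ssize_t)sizeof(msg) &&
	    memcmp(buf, msg, sizeof(msg)) == 0)
		hit = 1;

	close(app_pipe[0]);
	close(app_pipe[1]);
	return hit;
}
```

A caller can check the result with
assert(stale_fd_write_is_misdirected() == 1). The point is that the
write reports success, so error handling on EBADF alone cannot catch
this case.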

AFAIU glibc has similar requirements with respect to io_uring
file descriptors.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com