ptrace() - Tracing the wrong thread after TID recycling

From: Olivier Dion
Date: Wed Aug 24 2022 - 17:25:02 EST


Hi,

There are some cases where it is currently not possible to ensure that
ptrace commands reach the intended thread, because of TID recycling.
Usually this is not a problem if the tracer is the direct parent of the
tracee, with the help of PTRACE_O_TRACECLONE.

However, when tracing sibling threads, one has to fork another process,
and now the roles are reversed: the tracer is the child. If the tracer
wants to attach to already running sibling threads of its parent, it has
to scan `/proc/[ppid]/task` to get the tids. However, it's possible for
these tids to be reused by another process by the time the tracer
attaches, resulting in tracing the wrong threads. The obvious solution
would be to pass the thread group id to ptrace(), as tgkill() does.
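
To make the race concrete, here is a minimal sketch of the scanning step.
list_tids() is a hypothetical helper name of mine, not an existing API;
it only reads the directory and attaches to nothing. Whatever it returns
is a snapshot, and the window between readdir() and a later
PTRACE_ATTACH is where a tid can be recycled.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: collect up to `max` thread ids listed in
 * /proc/<pid>/task. Returns the number of tids stored, or -1 if the
 * directory cannot be opened.
 *
 * The result is only a snapshot. Any tid in it may exit and be reused
 * by an unrelated task before the caller issues PTRACE_ATTACH.
 */
int list_tids(pid_t pid, pid_t *out, size_t max)
{
    char path[64];
    DIR *dir;
    struct dirent *ent;
    size_t n = 0;

    assert(out != NULL || max == 0);

    snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
    dir = opendir(path);
    if (!dir)
        return -1;

    while (n < max && (ent = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(ent->d_name, &end, 10);

        /* Skip "." and ".." and anything that is not a positive number. */
        if (end == ent->d_name || *end != '\0' || tid <= 0)
            continue;
        out[n++] = (pid_t)tid;
    }

    closedir(dir);
    return (int)n;
}
```

A tracer forked from the tracee would call this as
list_tids(getppid(), tids, max) and then attach to each entry, with
nothing tying a given tid back to the parent's thread group.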

This RFC <https://lkml.org/lkml/2020/4/26/253> seems to address this
issue, although IIRC pidfd only applies to PIDs and not to TIDs, so the
problem remains.

An ad-hoc solution I've come up with is to `open("/proc/self/task")` in
the tracee before forking the tracer. Then, the tracer will
ptrace(PTRACE_ATTACH) to the desired threads found by scanning that
directory. Assuming that the attach worked, the tracer will then do an
openat(O_PATH) on the same directory with the thread id as the
pathname. If the call fails, it means that the thread is not a sibling
thread of our parent, and the tracer can detach.
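
The attach-then-verify step could be sketched roughly as follows. This
assumes `task_dirfd` is the descriptor the tracee opened on
/proc/self/task before forking and that the tracer inherited it. The
names tid_in_group() and attach_checked() are hypothetical, and error
handling is kept to the minimum needed to show the idea.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if `tid` has an entry under the task
 * directory referred to by `task_dirfd`, 0 otherwise.
 */
int tid_in_group(int task_dirfd, pid_t tid)
{
    char name[32];
    int fd;

    assert(task_dirfd >= 0);

    snprintf(name, sizeof(name), "%d", (int)tid);
    fd = openat(task_dirfd, name, O_PATH | O_CLOEXEC);
    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}

/*
 * Hypothetical helper: attach to `tid`, then check that it really
 * belongs to the expected thread group. Returns 0 if we end up attached
 * to a thread of that group, -1 otherwise.
 */
int attach_checked(int task_dirfd, pid_t tid)
{
    int status;

    if (ptrace(PTRACE_ATTACH, tid, NULL, NULL) == -1)
        return -1;

    /* __WALL so that we also wait for threads created with clone(). */
    if (waitpid(tid, &status, __WALL) == -1)
        return -1;
    if (!WIFSTOPPED(status))
        return -1; /* the thread went away, nothing left to detach */

    if (!tid_in_group(task_dirfd, tid)) {
        /* Recycled tid: not one of our parent's threads. */
        ptrace(PTRACE_DETACH, tid, NULL, NULL);
        return -1;
    }
    return 0;
}
```

The check is done after the attach on purpose: the lookup is only
meaningful once the thread is stopped under our control and can no
longer exit and free its tid behind our back.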

Thoughts?

--
Olivier Dion
oldiob.dev