Re: [RFC v2] ptrace, pidfd: add pidfd_ptrace syscall

From: Christian Brauner
Date: Mon Apr 27 2020 - 16:13:18 EST


On Mon, Apr 27, 2020 at 10:08:03PM +0200, Arnd Bergmann wrote:
> On Mon, Apr 27, 2020 at 8:59 PM Hagen Paul Pfeifer <hagen@xxxxxxxx> wrote:
> >
> > * Eric W. Biederman | 2020-04-27 13:18:47 [-0500]:
> >
> > >I am conflicted about that but I have to agree. Instead of
> > >duplicating everything it would be good enough to duplicate the once
> > >that cause the process to be attached to use. Then there would be no
> > >more pid races to worry about.
> >
> > >How does this differ using the tracing related infrastructure we have
> > >for the kernel on a userspace process? I suspect augmenting the tracing
> > >infrastructure with the ability to set breakpoints and watchpoints (aka
> > >stopping userspace threads and processes might be a more fertile
> > >direction to go).
> > >
> > >But I agree either we want to just address the races in PTRACE_ATTACH
> > >and PTRACE_SIEZE or we want to take a good hard look at things.
> > >
> > >There is a good case for minimal changes because one of the cases that
> > >comes up is how much work will it take to change existing programs. But
> > >ultimately ptrace pretty much sucks so a very good set of test cases and
> > >documentation for what we want to implement would be a very good idea.
> >
> > Hey Eric, Jann, Christian, Arnd,
> >
> > thank you for your valuable input! IMHO I think we have exactly two choices
> > here:
> >
> > a) we go with my patchset that is 100% ptrace feature compatible - except the
> > pidfd thing - now and in the future. If ptrace is extended pidfd_ptrace is
> > automatically extended and vice versa. Both APIs are feature identical
> > without any headaches.
> > b) leave ptrace completely behind us and design ptrace that we have always
> > dreamed of! eBPF filters, ftrace kernel architecture, k/uprobe goodness,
> > a speedy API to copy & modify large chunks of data, io_uring/epoll support
> > and of course: pidfd based (missed likely thousands of other dreams)
> >
> > I think a solution in between is not worth the effort! It will not be
> > compatible in any way for the userspace and the benefit will be negligible.
> > Ptrace is horrible API - everybody knows that but developers get comfy with
> > it. You find examples everywhere, why should we make it harder for the user for
> > no or little benefit (except that stable handle with pidfd and some cleanups)?
> >
> > Any thoughts on this?
>
> The way I understood Jann was that instead of a new syscall that duplicates
> everything in ptrace(), there would only need to be a new ptrace request
> such as PTRACE_ATTACH_PIDFD that behaves like PTRACE_ATTACH
> but takes a pidfd as the second argument, perhaps setting the return value
> to the pid on success. Same for PTRACE_SEIZE.

That was my initial suggestion, yes. Any enum that identifies a target
by a pid will get a new _PIDFD version and the pidfd is passed as pid_t
argument. That should work and is similar to what I did for waitid()
P_PIDFD. Realistically, there shouldn't be any system where pid_t is
smaller than an int that we care about.

>
> In effect this is not much different from your a), just a variation on the
> calling conventions. The main upside is that it avoids adding another
> ugly interface, the flip side is that it makes the existing one slightly worse
> by adding complexity.

Basically, if a new syscall than please a proper re-design with real
benefits.

In the meantime we could make due with the _PIDFD variant. And then if
someone wants to do the nitty gritty work of adding a ptrace variant
purely based on pidfds and with a better api and features that e.g. Jann
pointed out then by all means, please do so. I'm sure we would all
welcome this as well.

Christian