Re: [PATCH v4 2/5] pid: Add PIDFD_IOCTL_GETFD to fetch file descriptors from processes

From: Arnd Bergmann
Date: Sat Dec 21 2019 - 08:53:58 EST


On Fri, Dec 20, 2019 at 4:35 AM Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
>
> On 2019-12-19, Sargun Dhillon <sargun@xxxxxxxxx> wrote:
> > On Thu, Dec 19, 2019 at 2:35 AM Christian Brauner
> > <christian.brauner@xxxxxxxxxx> wrote:
> > > I guess this is the remaining question we should settle, i.e. what do we
> > > prefer.
> > > I still think that adding a new syscall for this seems a bit rich. On
> > > the other hand it seems that a lot more people agree that using a
> > > dedicated syscall instead of an ioctl is the correct way; especially
> > > when it touches core kernel functionality. I mean that was one of the
> > > takeaways from the pidfd API ioctl-vs-syscall discussion.
> > >
> > > A syscall is nicer especially for core-kernel code like this.
> > > So I guess the only way to find out is to try the syscall approach and
> > > either get yelled at and switch to an ioctl() or have it accepted.
> > >
> > > What does everyone else think? Arnd, still in favor of a syscall I take
> > > it. Oleg, you had suggested a syscall too, right? Florian, any
> > > thoughts/worries on/about this from the glibc side?
> > >
> > > Christian
> >
> > My feelings towards this are that syscalls might pose a problem if we
> > ever want to extend this API. Of course we can have a reserved
> > "flags" field, and populate it later, but what if we turn out to need
> > a proper struct? I already know we're going to want to add one
> > around cgroup metadata (net_cls), and likely we'll want to add
> > a "steal" flag as well. As Arnd mentioned earlier, this is trivial to
> > fix in a traditional ioctl environment, as ioctls are "cheap". How
> > do we feel about potentially adding a pidfd_getfd2? Or are we
> > confident that reserved flags will save us?
>
> If we end up making this a syscall, then we can re-use the
> copy_struct_from_user() API to make it both extensible and compatible in
> both directions. I wasn't aware that this was frowned upon for ioctls
> (sorry for the extra work) but there are several syscalls which use this
> model for extensibility (clone3, openat2, sched_setattr,
> perf_event_open) so there shouldn't be any such complaints for a
> syscall which is extensible.

I would still not do it for syscalls, although for other reasons:

- in an ioctl, it's better to define a new command code if you
need a larger structure

- in a system call, it's best to pass all arguments in individual
registers. The only time we use indirect data structures is when there
are more than six arguments.
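
To make the two conventions in this thread concrete, here is a rough
userspace sketch, not code from the patch. Every name in it (getfd_args,
copy_struct_sketch, getfd_by_registers, GETFD_KNOWN_FLAGS) is hypothetical.
The first helper only models the size-versioned copy that
copy_struct_from_user() is used for: short structs from old callers are
zero-extended, and longer structs from new callers are accepted only if the
unknown tail is all zeroes. The second shows the plain-register style with a
reserved flags argument.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical argument struct; the fields are illustrative only. */
struct getfd_args {
	unsigned int fd;
	unsigned int flags;
};

/* Assumption: no flags are defined yet, so any set bit is unknown. */
#define GETFD_KNOWN_FLAGS 0u

/*
 * Userspace model of a size-versioned struct copy.
 * dst/ksize: the layout this "kernel" knows about.
 * src/usize: what the caller passed in.
 */
static int copy_struct_sketch(void *dst, size_t ksize,
			      const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t n = usize < ksize ? usize : ksize;
	size_t i;

	/* Newer caller: bytes we do not understand must all be zero. */
	for (i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -E2BIG;

	/* Older caller: fields it did not supply default to zero. */
	memset(dst, 0, ksize);
	memcpy(dst, src, n);
	return 0;
}

/*
 * Register-style alternative: every argument is passed directly and
 * unknown flag bits are rejected so they can be assigned meaning later.
 * This stub does no real work; it only shows the argument checking.
 */
static long getfd_by_registers(int pidfd, int targetfd, unsigned int flags)
{
	(void)pidfd;
	(void)targetfd;

	if (flags & ~GETFD_KNOWN_FLAGS)
		return -EINVAL;
	return 0;
}
```

With the struct variant, growing the interface means appending fields and
letting the size argument do the versioning. With the register variant, it
means defining new flag bits, or a new call if the flags turn out not to
be enough.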

Arnd