Re: [PATCH 0/4] pid: add pidctl()
From: Christian Brauner
Date: Mon Mar 25 2019 - 16:40:28 EST
On Mon, Mar 25, 2019 at 09:34:00PM +0100, Jann Horn wrote:
> On Mon, Mar 25, 2019 at 9:15 PM Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> > On Mon, Mar 25, 2019 at 12:42 PM Jonathan Kowalski <bl0pbl33p@xxxxxxxxx> wrote:
> > > On Mon, Mar 25, 2019 at 6:57 PM Daniel Colascione <dancol@xxxxxxxxxx> wrote:
> [...]
> > > Yes, but everything in /proc is not equivalent to an attribute, or an
> > > option, and depending on its configuration, you may not want to allow
> > > processes to even be able to see /proc for any PIDs other than those
> > > running as their own user (hidepid). This means, even if this new
> > > system call is added, to respect hidepid, it must, depending on if
> > > /proc is mounted (and what hidepid is set to, and what gid= is set
> > > to), return EPERM, because then there is a discrepancy between how the
> > > two entrypoints to acquire a process handle do access control.
> >
> > That's why I proposed that this translation mechanism accept a procfs
> > root directory --- so you'd specify *which* procfs you want and let
> > the kernel apply whatever hidepid access restrictions it wants.
> [...]
> > > > and 2) it's
> > > > "fail unsafe": IMHO, most users in practice will skip the line marked
> > > > "LIVENESS CHECK", and as a result, their code will appear to work but
> > > > contain subtle race conditions. An explicit interface to translate
> > > > from a (PIDFD, PROCFS_ROOT) tuple to a /proc/pid directory file
> > > > descriptor would be both more efficient and fail-safe.
> > > >
> > > > [1] as a separate matter, it'd be nice to have a batch version of close(2).
> > >
> > > Since /proc is full of gunk,
> >
> > People keep saying /proc is bad, but I haven't seen any serious
> > proposals for a clean replacement. :-)
> >
> > > how about adding more to it and making
> > > the magic symlink of /proc/self/fd for the pidfd to lead to the dirfd
> > > of the /proc entry of the process it maps to, when one uses
> > > O_DIRECTORY while opening it? Otherwise, it behaves as it does today.
> > > It would be equivalent to opening the proc entry with usual access
> > > restrictions (and hidepid made to work) but without the races, and
> > > because for processes outside your and children pid ns, it shouldn't
> > > work anyway, and since they wouldn't have their entry on this procfs
> > > instance, it would all just fit in nicely?
> >
> > Thanks. That'll work. It's a bit magical, but /proc/self/fd is magical
> > anyway, so that's okay.
>
> Please don't do that. /proc/$pid/fd refers to the set of file
> descriptors the process has open, and semantically doesn't have much
> to do with the identity of the process. If you want to have a procfs
> directory entry for getting a pidfd, please add a new entry. (Although
> I don't see the point in adding a new procfs entry for this when you
> could instead have an ioctl or syscall operating on the procfs
> directory fd.)
Very much agreed!