Re: pidfd design

From: Michael Tirado
Date: Mon Mar 25 2019 - 16:44:44 EST


On Mon, Mar 25, 2019 at 5:45 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Mar 22, 2019 at 11:34 AM Michael Tirado <mtirado418@xxxxxxxxx> wrote:
> >
> > On Wed, Mar 20, 2019 at 8:08 PM Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:
> > >
> > > pidfd code should be backed out immediately. Forget about /proc.
> >
> > Seems like Torvalds just merges this sort of "stuff" without reading
> > it now, or there's something that auto accepted pull request to RC tree?
>
> There is no auto-accept.
>
> But there also didn't seem to be any valid arguments against it, and
> the android people had arguments for it.
>

Isn't Google working on their own C++ kernel now? I bet they would want
to make a smooth transition to that at some point. Hopefully they don't
screw up Linux in the process.


> Arguing against it based on "I don't like /proc" is pointless. The
> fact is, /proc is our system interface for a lot of things.
>

The argument was valid to me. At least the design is not set in
stone just yet, so there is still hope. I have an option in my
namespace sandbox called "noproc". It works for many things, but
if devs start relying on /proc ALWAYS being available I begin to
have issues. You are all aware of the horrors of /proc, I hope.
I don't want /proc so deeply entrenched in the ecosystem that I
can no longer use "noproc". These sorts of bold new core features
need to be designed with extreme caution and awareness of the full
spectrum of effects. Just because something like procfs exists and
can be used doesn't mean it is wise to go all-in.


> Arguing against it based on "I worry about the _other_
> non-signal-sending things that could be done with this" is also
> pointless. What other things? The only thing that got merged was the
> signalling.
>

There have been "future changes" hinted at throughout the patches'
lifecycle, which leads me to believe it's a gateway patch, and that the
PID wrapping fix is a minor bugfix bridge to some other undisclosed
features. How could anyone know the design is right without knowing
what these changes might be?

pidctl/translate_pid?
I am against any new system call that crosses namespaces by design to
accomplish something that is already plumbable. It seems like they want
to use pidfds as some sort of token to perform these cross-namespace
operations. Can't wait to see how devs end up abusing this one.


> So the model of using a file descriptor instead of a 'pid' for signal
> handling is actually very unix-like. Maybe that's how pid's should
> have worked to begin with. Remember that whole "everything is a file"
> thing?
>

Perhaps it could be called an improvement if y'all get it right, because
AFAIK the only way to handle wrapping today is to directly clone the
process you're worried about and deal with it immediately when it exits,
before wrap can happen. But I really wonder why PID wrapping matters SO
much. I bet some people are doing dangerously stupid things like using
a PID as a credential even though everyone knows it wraps.
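
To illustrate what I mean by cloning the process yourself, here is a
minimal sketch in Python. It relies on the fact that a child's PID stays
reserved until its parent reaps it. The function name signal_own_child is
made up for this example, and a real supervisor would do far more than
this.

```python
import os
import signal


def signal_own_child(sig=signal.SIGTERM):
    """Fork a child, signal it by PID, then reap it.

    Returns the number of the signal that terminated the child,
    or None if the child was not killed by a signal.
    """
    pid = os.fork()
    if pid == 0:
        # Child: sleep until a signal arrives.
        signal.pause()
        os._exit(0)

    # Parent: we have not called waitpid() yet, so even if the child
    # has already died it lingers as a zombie and its PID cannot be
    # handed to an unrelated process. kill() cannot hit a stranger here.
    os.kill(pid, sig)

    # Reaping releases the PID. From this point on the number may be
    # recycled and must not be signalled again.
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None
```

This only helps the direct parent. Anyone else holding the same number
has no such guarantee once the parent has reaped the child.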

Maybe this can make signalling less racy somehow?
At the very least you could learn the process has exited instead of
blindly acting on a potentially recycled number. I recognize the value
in that specifically.
However, using pidfd as a token to do cross-namespace activities that
are already plumbable is just plain weird to me, but maybe I'm too used
to doing things "the hard way".
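
For the part I do see value in, here is a rough sketch of learning that
a process is gone through a file descriptor instead of a bare number.
It assumes Linux 5.1 or later, Python 3.9 or later (for
signal.pidfd_send_signal), and a mounted /proc, which is exactly the
dependency I complained about above. The helper names open_proc_handle
and still_alive are invented for this example.

```python
import os
import signal


def open_proc_handle(pid):
    """Open /proc/<pid> as a directory fd that pins this exact process."""
    return os.open(f"/proc/{pid}", os.O_RDONLY | os.O_DIRECTORY)


def still_alive(handle):
    """Probe the process behind `handle` with signal 0.

    Signal 0 delivers nothing; it only runs the existence and
    permission checks. The fd refers to one specific process, so a
    recycled PID number can never be mistaken for the original.
    Note that an exited but unreaped (zombie) process still counts
    as existing.
    """
    try:
        signal.pidfd_send_signal(handle, 0)
        return True
    except ProcessLookupError:
        # ESRCH: the process this fd referred to has been reaped.
        return False
```

The caller should os.close() the handle when done with it. Passing a
real signal number instead of 0 sends that signal under the same
guarantee.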