Re: [PATCH] samples: make pidfd-metadata fail gracefully on older kernels

From: Christian Brauner
Date: Thu Jun 20 2019 - 07:15:55 EST


On Thu, Jun 20, 2019 at 02:00:37PM +0300, Dmitry V. Levin wrote:
> Cc'ed more people as the issue is not just with the example but
> with the interface itself.
>
> On Thu, Jun 20, 2019 at 12:31:06PM +0200, Christian Brauner wrote:
> > On Thu, Jun 20, 2019 at 06:11:44AM +0300, Dmitry V. Levin wrote:
> > > Initialize pidfd to an invalid descriptor, to fail gracefully on
> > > those kernels that do not implement CLONE_PIDFD and leave pidfd
> > > unchanged.
> > >
> > > Signed-off-by: Dmitry V. Levin <ldv@xxxxxxxxxxxx>
> > > ---
> > > samples/pidfd/pidfd-metadata.c | 8 ++++++--
> > > 1 file changed, 6 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/samples/pidfd/pidfd-metadata.c b/samples/pidfd/pidfd-metadata.c
> > > index 14b454448429..ff109fdac3a5 100644
> > > --- a/samples/pidfd/pidfd-metadata.c
> > > +++ b/samples/pidfd/pidfd-metadata.c
> > > @@ -83,7 +83,7 @@ static int pidfd_metadata_fd(pid_t pid, int pidfd)
> > >
> > > int main(int argc, char *argv[])
> > > {
> > > -	int pidfd = 0, ret = EXIT_FAILURE;
> > > +	int pidfd = -1, ret = EXIT_FAILURE;
> >
> > Hm, that currently won't work since we added a check in fork.c for
> > pidfd == 0. If it isn't you'll get EINVAL.
>
> Sorry, I must've missed that check. But this makes things even worse.
>
> > This was done to ensure that
> > we can potentially extend CLONE_PIDFD by passing in flags through the
> > return argument.
> > However, I find this increasingly unlikely. Especially since the
> > interface would be horrendous and an absolute last resort.
> > If clone3() gets merged for 5.3 (currently in linux-next) we also have
> > no real need anymore to extend legacy clone() this way. So either wait
> > until (if) we merge clone3() where the check I mentioned is gone anyway,
> > or remove the pidfd == 0 check from fork.c in a preliminary patch.
> > Thoughts?
>
> Userspace needs a reliable way to tell whether CLONE_PIDFD is supported
> by the kernel or not.

Right, that's the general problem with legacy clone(): it silently
ignores unknown flags. clone3(), by contrast, will fail with EINVAL if
you pass any flag it doesn't know about.

For legacy clone() you can pass

(CLONE_PIDFD | CLONE_DETACHED)

on all relevant kernels >= 2.6.2. On a kernel that does not support
CLONE_PIDFD, both flags are silently ignored and the clone() succeeds.
On a kernel that does support CLONE_PIDFD, specifying both flags gets
you EINVAL. (We did this because we wanted to keep the option of
reusing CLONE_DETACHED together with CLONE_PIDFD.)
Does that help?
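
To make that concrete, here is a rough sketch of such a probe. It is
only an illustration, not code from the tree: interpret_probe(),
probe_clone_pidfd() and probe_child() are made-up names, the fallback
flag values are the ones from the uapi headers, and it assumes the
glibc clone() wrapper and a stack that grows downwards.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Fallbacks for older libc headers; values match the uapi headers. */
#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000
#endif
#ifndef CLONE_DETACHED
#define CLONE_DETACHED 0x00400000
#endif

enum pidfd_support {
	PIDFD_UNKNOWN = -1,	/* clone() failed for some other reason */
	PIDFD_UNSUPPORTED = 0,	/* flags were ignored, a child was created */
	PIDFD_SUPPORTED = 1,	/* kernel rejected the flag combination */
};

/* Pure decision logic (hypothetical helper), kept separate from the syscall. */
static enum pidfd_support interpret_probe(int clone_ret, int err)
{
	if (clone_ret > 0)
		return PIDFD_UNSUPPORTED;
	if (clone_ret == -1 && err == EINVAL)
		return PIDFD_SUPPORTED;
	return PIDFD_UNKNOWN;
}

/* Assumption: 64 KiB is plenty for a child that exits immediately. */
static char child_stack[64 * 1024] __attribute__((aligned(16)));

static int probe_child(void *arg)
{
	(void)arg;
	return 0;	/* only runs on kernels that ignored both flags */
}

static enum pidfd_support probe_clone_pidfd(void)
{
	int pidfd = -1;
	int ret, err;

	ret = clone(probe_child, child_stack + sizeof(child_stack),
		    CLONE_PIDFD | CLONE_DETACHED | SIGCHLD, NULL, &pidfd);
	err = errno;	/* only meaningful when ret == -1 */

	/* In the parent, the wrapper returns a pid or -1, never 0. */
	assert(ret != 0);

	if (ret > 0) {
		/* Old kernel: reap the child that was created. */
		while (waitpid(ret, NULL, 0) < 0 && errno == EINTR)
			;
	}
	return interpret_probe(ret, err);
}
```

A caller would run probe_clone_pidfd() once at startup and only pass
CLONE_PIDFD to clone() when it returns PIDFD_SUPPORTED. The cost on old
kernels is one short-lived child.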

>
> If CLONE_PIDFD is not supported, then pidfd remains unchanged.
>
> If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
> also remains unchanged, which effectively means that userspace must ensure
> that fd 0 is not closed when invoking CLONE_PIDFD. This is ugly.
>
> If we can assume that clone(CLONE_PIDFD) is not going to be extended,
> then I'm for removing the pidfd == 0 check along with recommending
> userspace to initialize pidfd with -1.

Right, I'm ok with that too.
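
For illustration, the userspace side of that recommendation might look
roughly like the sketch below. It assumes a kernel where the
pidfd == 0 check has already been removed, and pidfd_is_valid(),
spawn_and_reap() and child_fn() are made-up names. Since 0 is a valid
descriptor number, -1 is the only sentinel that can tell "kernel left
it unchanged" apart from "kernel handed back fd 0".

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallback for older libc headers; value matches the uapi header. */
#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000
#endif

/* Any non-negative value, including 0, is a usable descriptor. */
static int pidfd_is_valid(int pidfd)
{
	return pidfd >= 0;
}

/* Assumption: 64 KiB is plenty for a child that exits immediately. */
static char child_stack[64 * 1024] __attribute__((aligned(16)));

static int child_fn(void *arg)
{
	(void)arg;
	return 0;
}

/*
 * Returns 1 if the kernel stored a pidfd, 0 if it left the sentinel
 * untouched (no CLONE_PIDFD support), -1 if clone() itself failed.
 */
static int spawn_and_reap(void)
{
	int pidfd = -1;	/* sentinel: stays -1 on kernels without CLONE_PIDFD */
	pid_t pid;

	pid = clone(child_fn, child_stack + sizeof(child_stack),
		    CLONE_PIDFD | SIGCHLD, NULL, &pidfd);
	if (pid < 0)
		return -1;

	while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
		;

	if (!pidfd_is_valid(pidfd))
		return 0;

	close(pidfd);
	return 1;
}
```

With pidfd initialized to 0 instead, a process that had fd 0 closed
could not distinguish the two cases, which is the problem described
above.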

Thanks!
Christian