Re: [PATCH] samples: make pidfd-metadata fail gracefully on older kernels

From: Dmitry V. Levin
Date: Fri Jun 21 2019 - 13:06:20 EST


On Thu, Jun 20, 2019 at 01:10:37PM +0200, Christian Brauner wrote:
> On Thu, Jun 20, 2019 at 02:00:37PM +0300, Dmitry V. Levin wrote:
> > Cc'ed more people as the issue is not just with the example but
> > with the interface itself.
> >
> > On Thu, Jun 20, 2019 at 12:31:06PM +0200, Christian Brauner wrote:
> > > On Thu, Jun 20, 2019 at 06:11:44AM +0300, Dmitry V. Levin wrote:
> > > > Initialize pidfd to an invalid descriptor, to fail gracefully on
> > > > those kernels that do not implement CLONE_PIDFD and leave pidfd
> > > > unchanged.
> > > >
> > > > Signed-off-by: Dmitry V. Levin <ldv@xxxxxxxxxxxx>
> > > > ---
> > > > samples/pidfd/pidfd-metadata.c | 8 ++++++--
> > > > 1 file changed, 6 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/samples/pidfd/pidfd-metadata.c b/samples/pidfd/pidfd-metadata.c
> > > > index 14b454448429..ff109fdac3a5 100644
> > > > --- a/samples/pidfd/pidfd-metadata.c
> > > > +++ b/samples/pidfd/pidfd-metadata.c
> > > > @@ -83,7 +83,7 @@ static int pidfd_metadata_fd(pid_t pid, int pidfd)
> > > >
> > > > int main(int argc, char *argv[])
> > > > {
> > > > - int pidfd = 0, ret = EXIT_FAILURE;
> > > > + int pidfd = -1, ret = EXIT_FAILURE;
> > >
> > > Hm, that currently won't work since we added a check in fork.c for
> > > pidfd == 0. If it isn't you'll get EINVAL.
> >
> > Sorry, I must've missed that check. But this makes things even worse.
> >
> > > This was done to ensure that
> > > we can potentially extend CLONE_PIDFD by passing in flags through the
> > > return argument.
> > > However, I find this increasingly unlikely. Especially since the
> > > interface would be horrendous and an absolute last resort.
> > > If clone3() gets merged for 5.3 (currently in linux-next) we also have
> > > no real need anymore to extend legacy clone() this way. So either wait
> > > until (if) we merge clone3() where the check I mentioned is gone anyway,
> > > or remove the pidfd == 0 check from fork.c in a preliminary patch.
> > > Thoughts?
> >
> > Userspace needs a reliable way to tell whether CLONE_PIDFD is supported
> > by the kernel or not.
>
> Right, that's the general problem with legacy clone(): it ignores
> unknown flags... clone3() will EINVAL you if you pass any flag it
> doesn't know about.
>
> For legacy clone you can pass
>
> (CLONE_PIDFD | CLONE_DETACHED)
>
> on all relevant kernels >= 2.6.2. CLONE_DETACHED will be silently
> ignored by the kernel if specified in flags. But if you specify both
> CLONE_PIDFD and CLONE_DETACHED on a kernel that does support CLONE_PIDFD
> you'll get EINVALed. (We did this because we wanted to have the ability
> to make CLONE_DETACHED reusable with CLONE_PIDFD.)
> Does that help?

Yes, this is feasible, but the cost is an extra syscall on new kernels
and more complicated userspace code, so...

> > If CLONE_PIDFD is not supported, then pidfd remains unchanged.
> >
> > If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
> > also remains unchanged, which effectively means that userspace must ensure
> > that fd 0 is not closed when invoking CLONE_PIDFD. This is ugly.
> >
> > If we can assume that clone(CLONE_PIDFD) is not going to be extended,
> > then I'm for removing the pidfd == 0 check along with recommending
> > that userspace initialize pidfd to -1.
>
> Right, I'm ok with that too.

... I'd prefer this variant.
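
For illustration, here is a rough sketch of both detection variants
discussed above. It is only a sketch under assumptions: the helper names
(raw_clone, probe_clone_pidfd, clone_pidfd) are made up for this example,
the raw clone syscall is assumed to take parent_tid as its third argument
(true on x86-64 and arm64, not on every architecture), and the second
variant only behaves as described if the pidfd == 0 check is removed from
fork.c.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_PIDFD
#define CLONE_PIDFD 0x00001000		/* value from the kernel uapi headers */
#endif
#ifndef CLONE_DETACHED
#define CLONE_DETACHED 0x00400000	/* ignored by legacy clone() */
#endif

/*
 * Hypothetical wrapper: fork-like raw clone(), the child exits at once.
 * Assumes parent_tid is the third syscall argument (x86-64, arm64).
 */
static pid_t raw_clone(unsigned long flags, int *parent_tid)
{
	long ret = syscall(SYS_clone, flags | SIGCHLD, NULL, parent_tid,
			   NULL, NULL);

	if (ret == 0)
		_exit(0);	/* child: nothing to do */
	return (pid_t)ret;
}

/*
 * Variant 1: probe with CLONE_PIDFD | CLONE_DETACHED.
 * Returns 1 if the kernel rejects the combination with EINVAL (taken to
 * mean CLONE_PIDFD is supported), 0 if both flags were ignored, and -1 if
 * clone failed for some other reason.
 */
static int probe_clone_pidfd(void)
{
	int pidfd = -1;
	pid_t pid = raw_clone(CLONE_PIDFD | CLONE_DETACHED, &pidfd);

	if (pid > 0) {
		/* Both flags were ignored and a child was created: reap it. */
		waitpid(pid, NULL, 0);
		return 0;
	}
	return errno == EINVAL ? 1 : -1;
}

/*
 * Variant 2: initialize pidfd to -1 and let the kernel overwrite it.
 * Returns the child's pid (or -1 on error); *pidfd stays -1 when the
 * kernel did not fill it in.
 */
static pid_t clone_pidfd(int *pidfd)
{
	*pidfd = -1;
	return raw_clone(CLONE_PIDFD, pidfd);
}
```

With the second variant the caller only has to check for pidfd >= 0 after
a successful clone, so no extra syscall is needed on new kernels.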


--
ldv
