Re: [PATCH v6 4/5] seccomp: add support for passing fds via USER_NOTIF

From: Jann Horn
Date: Mon Sep 10 2018 - 13:01:14 EST


On Thu, Sep 6, 2018 at 8:30 PM Tycho Andersen <tycho@xxxxxxxx> wrote:
> On Thu, Sep 06, 2018 at 10:22:46AM -0600, Tycho Andersen wrote:
> > On Thu, Sep 06, 2018 at 06:15:18PM +0200, Jann Horn wrote:
> > > On Thu, Sep 6, 2018 at 5:29 PM Tycho Andersen <tycho@xxxxxxxx> wrote:
> > > > The idea here is that the userspace handler should be able to pass an fd
> > > > back to the trapped task, for example so it can be returned from socket().
> > > [...]
> > > > diff --git a/Documentation/userspace-api/seccomp_filter.rst b/Documentation/userspace-api/seccomp_filter.rst
> > > > index d1498885c1c7..1c0aab306426 100644
> > > > --- a/Documentation/userspace-api/seccomp_filter.rst
> > > > +++ b/Documentation/userspace-api/seccomp_filter.rst
> > > > @@ -235,6 +235,9 @@ The interface for a seccomp notification fd consists of two structures:
> > > > __u64 id;
> > > > __s32 error;
> > > > __s64 val;
> > > > + __u8 return_fd;
> > > > + __u32 fd;
> > > > + __u32 fd_flags;
> > >
> > > Normally, syscalls that take an optional file descriptor accept a
> > > signed 32-bit number, with -1 standing for "no file descriptor". Is
> > > there a reason why this uses a separate variable to signal whether an
> > > fd was provided?
> >
> > No real reason other than I looked at the bpf code and they were using
> > __u32 for bpf (but I think in their case the fd args are not
> > optional). I'll switch it to __s32/-1 for the next version.
>
> Oh, I think there is a reason actually: since this is an API addition,
> the "0" value needs to be the previously default behavior if userspace
> doesn't specify it. Since the previously default behavior was not to
> return an fd, and we want to allow fd == 0, we need the extra flag to
> make this work.
>
> This is really only a problem because we're introducing this stuff in
> a second patch (mostly to illustrate how extending the response
> structure would work). I could fold this into the first patch if we
> want, or we could keep the return_fd bits if the illustration is
> useful.

I feel like adding extra struct fields just so that it is possible to
write programs against the intermediate new API between two kernel
commits is taking things a bit far.