Re: [PATCH] seccomp: Add group_leader pid to seccomp_notif

From: Sargun Dhillon
Date: Sun May 17 2020 - 07:18:00 EST


On Sun, May 17, 2020 at 12:17 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> On Fri, May 15, 2020 at 04:40:05PM -0700, Sargun Dhillon wrote:
> > This includes the thread group leader ID in the seccomp_notif. This is
> > immediately useful for opening up a pidfd for the group leader, as
> > pidfds only work on group leaders.
> >
> > Previously, it was considered to include an actual pidfd in the
> > seccomp_notif structure[1], but it was suggested to avoid proliferating
> > mechanisms to create pidfds[2].
> >
> > [1]: https://lkml.org/lkml/2020/1/24/133
> > [2]: https://lkml.org/lkml/2020/5/15/481
>
> nit: please use lore.kernel.org/lkml/ URLs
>
> > Suggested-by: Christian Brauner <christian.brauner@xxxxxxxxxx>
> > Signed-off-by: Sargun Dhillon <sargun@xxxxxxxxx>
> > ---
> > include/uapi/linux/seccomp.h | 2 +
> > kernel/seccomp.c | 1 +
> > tools/testing/selftests/seccomp/seccomp_bpf.c | 50 +++++++++++++++++++
> > 3 files changed, 53 insertions(+)
> >
> > diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> > index c1735455bc53..f0c272ef0f1e 100644
> > --- a/include/uapi/linux/seccomp.h
> > +++ b/include/uapi/linux/seccomp.h
> > @@ -75,6 +75,8 @@ struct seccomp_notif {
> > __u32 pid;
> > __u32 flags;
> > struct seccomp_data data;
> > + __u32 tgid;
> > + __u8 pad0[4];
> > };
>
> I think we need to leave off padding and instead use __packed. If we
> don't then userspace can't tell when "pad0" changes its "meaning" (i.e.
> the size of seccomp_notif becomes 88 bytes with above -- either via
> explicit padding like you've got or via implicit by the compiler. If
> some other u32 gets added in the future, user space will still see "88"
> as the size.
>
I've had previous feedback about using "packed". See:
https://lore.kernel.org/lkml/87o8w9bcaf.fsf@xxxxxxxxxxxxxxxxx/
https://lore.kernel.org/lkml/a328b91d-fd8f-4f27-b3c2-91a9c45f18c0@xxxxxxxxxxxxxxxxxx/
> So I *think* the right change here is:
>
> -};
> + __u32 tgid;
> +} __packed;
>
> Though tgid may need to go above seccomp_data... for when it grows.
> Agh...
(How) can seccomp_data grow safely, even with this extensibility mechanism?
>
> _However_, unfortunately, I appear to have no thought this through very
> well, and there is actually no sanity-checking in the kernel for dealing
> with an old userspace when sizes change. :( For example, if a userspace
> doesn't check sizes and calls an ioctl, etc, the kernel will clobber the
> user buffer if it's too small.
>
> Even the SECCOMP_GET_NOTIF_SIZES command lacks a buffer size argument.
> :(
>
> So:
>
> - should we just declare such userspace as "wrong"? I don't think
> that'll work, especially since what if we ever change the size of
> seccomp_data... that predated the ..._SIZES command.
>
> - should we add a SECCOMP_SET_SIZES command to tell the kernel what
> we're expecting? There's no "state" associated across seccomp(2)
> calls, but maybe that doesn't matter because only user_notif writes
> back to userspace. For the ioctl, the state could be part of the
> private file data? Sending seccomp_data back to userspace only
> happens here, and any changes in seccomp_data size will just be seen
> as allowing a filter to query further into it.
Will we ever grow seccomp_data?

I suggest we throw away the _SIZES api, and just introduce RECV2, which sends
back a known, fixed format, and deprecate these dynamically sized uapi
shenanigans.

(Queue RECV3, etc..)

Maybe we do something like perf_event_open, where there's a read_format,
and that's used by the user to determine how big of a response / fields they
want to get?

>
> - should GET_SIZES report "useful" size? (i.e. exclude padding?)
>
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
>
> Yay test updates! :)
>
> > +TEST(user_notification_groupleader)
>
> In my first pass of review I was going to say "can you please also check
> the sizes used by the ioctl?" But that triggered the above size checking
> mess in my mind.
>
> Let me look at this more closely on Monday, and I'll proposed something.
> :P
To summarize my set of ideas:
1. We take the ptrace-style API, where we have a request to get the tgid of
a given request ID (or any new / extensible field)
2. We add a perf_event_open style API, where you have to tell it what fields
to include in the response
3. We introduce RECV2 [through N]
4. We never extend seccomp_data, and just continue to append things to the API
5. We rev the API _once_ and unroll seccomp_data, and make it so that
new members have to be *asked for*, rather than are implicitly
included.
>
> Thanks!
>
> -Kees
>
> --
> Kees Cook