Re: seccomp: Delay filter activation

From: Kees Cook
Date: Tue Mar 02 2021 - 02:44:59 EST


On Mon, Mar 01, 2021 at 02:21:56PM +0100, Christian Brauner wrote:
> On Mon, Mar 01, 2021 at 12:09:09PM +0100, Christian Brauner wrote:
> > On Sat, Feb 20, 2021 at 01:31:57AM -0800, Sargun Dhillon wrote:
> > > We've run into a problem where attaching a filter can be quite messy
> > > business because the filter itself intercepts sendmsg, and other
> > > syscalls related to exfiltrating the listener FD. I believe that this
> > > problem set has been brought up before, and although there are
> > > "simpler" methods of exfiltrating the listener, like clone3 or
> > > pidfd_getfd, but these are still less than ideal.

I'm trying to make sure I understand: the target process would like to
have a filter attached that blocks sendmsg, but that would mean it has
no way to send the listener FD to its manager?

And you'd want to have listening working for sendmsg (otherwise you
could do it with two filters, I imagine)?

> > int fd_filter = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_DETACHED, &prog);
> >
> > BARRIER_WAIT_SETUP_DONE;
> >
> > int ret = seccomp(SECCOMP_ATTACH_FILTER, 0, INT_TO_PTR(fd_listener));
>
> This obviously should've been sm like:
>
> struct seccomp_filter_attach {
> union {
> __s32 pidfd;
> __s32 pid;
> };
> __u32 fd_filter;
> };
>
> and then
>
> int ret = seccomp(SECCOMP_ATTACH_FILTER, 0, seccomp_filter_attach);

Given the difficulty with TSYNC, I'm not excited about adding an
"apply this filter to another process" API. :)

The prior thread was here:
https://lore.kernel.org/lkml/20201029075841.GB29881@ircssh-2.c.rugged-nimbus-611.internal/

But I haven't had time to follow up. Both Andy and Sargun discuss filter
"replacement", but I'm not a fan of that, since I'd really like to keep
the "additive-only" property of seccomp.

So, I'm still back to wanting an answer to my questions at the end of
https://lore.kernel.org/lkml/202010281503.3D1FCFE0@keescook/

Namely, how to best indicate the point of execution where "delayed"
filters become applied?

If we require supporting the "2b" (launched oblivious target) case
(which I think we must), we need to signal it externally, or via an
automatic trip point.

Since synchronizing with an oblivious target is rather nasty (e.g.
involving ptrace or at least ptrace access checking), I'd rather create
a predefined trip point. Having it be "execve" limits the utility of
this feature for cooperating targets, though, so I think "apply on exec"
isn't great.

struct seccomp_filter_attach_trigger {
u64 nr;
unsigned char *filter;
};

seccomp(SECCOMP_ATTACH_FILTER_TRIGGER, 0, seccomp_filter_attach_trigger);

after "nr" is evaluated (but before it runs), seccomp installs the
filter.

And by "installs", I'm not sure if it needs to keep it in a queue, with
separate ref coutning, or if it should be in the main filter stack, but
have an "alive" toggle, or what.

--
Kees Cook