Re: [RFC 1/3] seccomp: add a return code to trap to userspace

From: Tycho Andersen
Date: Wed Feb 14 2018 - 12:23:10 EST


On Wed, Feb 14, 2018 at 05:19:52PM +0000, Andy Lutomirski wrote:
> On Wed, Feb 14, 2018 at 3:29 PM, Tycho Andersen <tycho@xxxxxxxx> wrote:
> > Hey Kees,
> >
> > Thanks for taking a look!
> >
> > On Tue, Feb 13, 2018 at 01:09:20PM -0800, Kees Cook wrote:
> >> On Sun, Feb 4, 2018 at 2:49 AM, Tycho Andersen <tycho@xxxxxxxx> wrote:
> >> > This patch introduces a means for syscalls matched in seccomp to notify
> >> > some other task that a particular filter has been triggered.
> >> >
> >> > The motivation for this is primarily for use with containers. For example,
> >> > if a container does an init_module(), we obviously don't want to load this
> >> > untrusted code, which may be compiled for the wrong version of the kernel
> >> > anyway. Instead, we could parse the module image, figure out which module
> >> > the container is trying to load and load it on the host.
> >> >
> >> > As another example, containers cannot mknod(), since this checks
> >> > capable(CAP_SYS_ADMIN). However, harmless devices like /dev/null or
> >> > /dev/zero should be ok for containers to mknod, but we'd like to avoid hard
> >> > coding some whitelist in the kernel. Another example is mount(), which has
> >> > many security restrictions for good reason, but configuration or runtime
> >> > knowledge could potentially be used to relax these restrictions.
> >>
> >> Related to the eBPF seccomp thread, can the logic for these things be
> >> handled entirely by eBPF? My assumption is that you still need to stop
> >> the process to do something (i.e. do a mknod, or a mount) before
> >> letting it continue. Is there some "wait for notification" system in
> >> eBPF?
> >
> > I replied in the other thread
> > (https://patchwork.ozlabs.org/cover/872938/#1856642 for those
> > following along at home), but no, at least not that I know of.
>
> eBPF can call functions. One of those functions could put the caller
> to sleep. In fact, I think I once proposed doing this for the seccomp
> logging action as well.

Yes, true. We could always add a bpf_func_map_lookup_wait or
something. I can look into that if it's preferable.