Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering
From: Ingo Molnar
Date: Wed May 25 2011 - 15:09:34 EST
* Kees Cook <kees.cook@xxxxxxxxxxxxx> wrote:
> [CC trimmed, as recommended]
>
> Hi,
>
> On Tue, May 24, 2011 at 10:08:15PM +0200, Ingo Molnar wrote:
> > * Will Drewry <wad@xxxxxxxxxxxx> wrote:
> >
> > > The change avoids defining a new trace call type or a huge number of internal
> > > changes and hides seccomp.mode=2 from ABI-exposure in prctl, but the attack
> > > surface is non-trivial to verify, and I'm not sure if this ABI change makes
> > > sense. It amounts to:
> > >
> > > include/linux/ftrace_event.h | 4 +-
> > > include/linux/perf_event.h | 10 +++++---
> > > kernel/perf_event.c | 49 +++++++++++++++++++++++++++++++++++++---
> > > kernel/seccomp.c | 8 ++++++
> > > kernel/trace/trace_syscalls.c | 27 +++++++++++++++++-----
> > > 5 files changed, 82 insertions(+), 16 deletions(-)
> > >
> > > And can be found here: http://static.dataspill.org/perf_secure/v1/
> >
> > Wow, i'm very impressed how few changes you needed to do to support this!
> > [...]
> > attr.require_secure: this is basically used to *force* the creation of
> > security-controlling filters, right? It seems to me that this could be done via
> > a seccomp ABI extension as well, without adding this to the perf ABI. That
> > seccomp call could check whether the right events are created and move the task
> > to mode 2 only if that prereq is met - or something like that.
>
> I understood the prctl() API that was outlined earlier, but it
> seems this is not going to happen now. What would the programming
> API actually look like for an application developer using this
> perf-style method?

Well, this API is probably not going to happen either ;-)

The way to do it is to create a perf event and install a filter by
passing the filter's ASCII string as a pointer to the kernel, using
PERF_EVENT_IOC_SET_FILTER on the event fd. If this ever gets used
seriously then it should probably move into its own system call - but
that is a detail.

Installing filters can be safely done by unprivileged user-space
(the kernel checks them), and they get inherited across fork(), are
properly per task, etc.

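
To make the sequence above concrete, here is a rough C sketch of it: open a
tracepoint perf event on the calling task, then hand the filter string to
the kernel with PERF_EVENT_IOC_SET_FILTER. The helper names
(init_tracepoint_attr, open_event_for_current_task, set_event_filter) are
made up for illustration. Note the assumptions: on a kernel without the
proposed seccomp patches the filter only selects which events get recorded
and does not block any system call, and opening tracepoint events usually
needs extra privileges.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: describe a tracepoint event. tracepoint_id is the
 * number read from the tracepoint's "id" file in the tracing filesystem.
 */
void init_tracepoint_attr(struct perf_event_attr *attr,
                          unsigned long long tracepoint_id)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_TRACEPOINT;
    attr->size = sizeof(*attr);
    attr->config = tracepoint_id;
    attr->inherit = 1; /* ask for the event to follow children across fork() */
}

/*
 * Hypothetical helper: open the event on the calling task. glibc has no
 * perf_event_open() wrapper, so go through syscall(). Returns an event fd,
 * or -1 with errno set.
 */
int open_event_for_current_task(struct perf_event_attr *attr)
{
    return (int)syscall(SYS_perf_event_open, attr,
                        0,    /* pid 0: the calling task */
                        -1,   /* cpu -1: any CPU */
                        -1,   /* no event group */
                        0UL); /* no flags */
}

/*
 * Hypothetical helper: pass the ASCII filter string to the kernel as a
 * pointer. Returns 0 on success, or -1 with errno set (for example when the
 * kernel rejects the expression).
 */
int set_event_filter(int event_fd, const char *filter)
{
    return ioctl(event_fd, PERF_EVENT_IOC_SET_FILTER, filter);
}
```

A caller would read the id from something like
/sys/kernel/debug/tracing/events/syscalls/sys_enter_write/id (the path
depends on where the tracing filesystem is mounted), fill in the attr, open
the event, and then call set_event_filter(fd, "fd == 1"). That filter
expression is only an example; the field names come from the tracepoint's
"format" file.
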
Thanks,
Ingo