Re: [PATCH 3/7] seccomp_filter: Enable ftrace-based system call filtering

From: Frederic Weisbecker
Date: Thu Apr 28 2011 - 11:58:10 EST


On Thu, Apr 28, 2011 at 10:15:04AM -0500, Will Drewry wrote:
> On Thu, Apr 28, 2011 at 9:29 AM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> > On Wed, Apr 27, 2011 at 10:08:47PM -0500, Will Drewry wrote:
> >> This change adds a new seccomp mode based on the work by
> >> agl@xxxxxxxxxxxxx. This mode comes with a bitmask of NR_syscalls size and
> >> an optional linked list of seccomp_filter objects. When in mode 2, all
> >
> > Since you now use the filters, why not use them to filter syscalls
> > entirely rather than using a bitmap of allowed syscalls?
>
> The current approach just uses a linked list of filters. While a more
> efficient data structure could be used, the bitmask provides a quick
> binary decision, and optimizes for the relatively common case where
> there won't be many non-binary filters to evaluate so we don't have to
> walk the list for a larger number of yes/no decisions versus more
> complex predicates. Though that may be a short-sighted view! I'm
> happy to change it up.
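
The two-stage check described above (bitmask first, then the filter list)
can be sketched in userspace C roughly as follows. This is an illustration
under assumptions, not code from the patch: the demo_* names, the 64-entry
table standing in for NR_syscalls, and the predicate signature are all
hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_SYSCALLS 64 /* hypothetical stand-in for NR_syscalls */
#define DEMO_WORD_BITS   (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS       ((DEMO_NR_SYSCALLS + DEMO_WORD_BITS - 1) / DEMO_WORD_BITS)

/* One non-binary filter: a predicate attached to a syscall number. */
struct demo_filter {
	int nr;                          /* syscall number it applies to */
	bool (*pred)(const long *args);  /* hypothetical predicate on the args */
	struct demo_filter *next;
};

struct demo_seccomp {
	unsigned long allowed[DEMO_WORDS]; /* one bit per syscall number */
	struct demo_filter *filters;       /* optional list of predicates */
};

/* Mark a syscall number as allowed in the bitmask. */
static void demo_allow(struct demo_seccomp *s, int nr)
{
	assert(nr >= 0 && nr < DEMO_NR_SYSCALLS);
	s->allowed[nr / DEMO_WORD_BITS] |= 1UL << (nr % DEMO_WORD_BITS);
}

/* Example predicate: accept only calls whose first argument is zero. */
static bool demo_first_arg_is_zero(const long *args)
{
	return args[0] == 0;
}

/* Returns true if the call may proceed. */
static bool demo_check(const struct demo_seccomp *s, int nr, const long *args)
{
	const struct demo_filter *f;

	if (nr < 0 || nr >= DEMO_NR_SYSCALLS)
		return false;
	/* Fast path: a cleared bit is an immediate "no", no list walk. */
	if (!(s->allowed[nr / DEMO_WORD_BITS] & (1UL << (nr % DEMO_WORD_BITS))))
		return false;
	/* Slow path: every predicate registered for this nr must pass. */
	for (f = s->filters; f; f = f->next)
		if (f->nr == nr && !f->pred(args))
			return false;
	return true;
}
```

In this sketch a syscall with its bit set and no matching filter is allowed
outright, so the list walk only costs anything for allowed calls, and the
predicates only run for the few syscalls that carry one.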

Well, using an hlist that points to the filters may not be that much slower.
Dunno, that probably needs to be measured.

No big deal for now.
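
To make the alternative concrete, here is a rough userspace sketch of
filtering entirely through per-syscall filter lists, with no bitmap. It is
only an illustration under assumptions: the sketch_* names are hypothetical,
plain singly linked lists stand in for the kernel's hlist, the table is
indexed directly by syscall nr, and "empty bucket means deny" is my reading
of the idea rather than anything in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_SYSCALLS 64 /* hypothetical stand-in for NR_syscalls */

struct sketch_filter {
	bool (*pred)(const long *args); /* NULL means "always allow" */
	struct sketch_filter *next;
};

/* One list head per syscall number; an empty list denies the call. */
struct sketch_table {
	struct sketch_filter *bucket[SKETCH_NR_SYSCALLS];
};

/* Push a filter onto the list for syscall nr. */
static void sketch_add(struct sketch_table *t, int nr, struct sketch_filter *f)
{
	assert(f != NULL);
	if (nr < 0 || nr >= SKETCH_NR_SYSCALLS)
		return;
	f->next = t->bucket[nr];
	t->bucket[nr] = f;
}

/* Example predicate: accept only a non-negative first argument. */
static bool sketch_arg0_nonneg(const long *args)
{
	return args[0] >= 0;
}

/* Returns true if the call may proceed. */
static bool sketch_check(const struct sketch_table *t, int nr, const long *args)
{
	const struct sketch_filter *f;

	if (nr < 0 || nr >= SKETCH_NR_SYSCALLS)
		return false;
	f = t->bucket[nr];
	if (!f)
		return false; /* no filter registered: deny */
	for (; f; f = f->next)
		if (f->pred && !f->pred(args))
			return false;
	return true;
}
```

The yes/no case then costs one array lookup plus one list node instead of
one bit test, which is the difference that would need measuring.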

>
> > You have the "nr" field in syscall tracepoints.
>
> I'm not sure I follow. Do you mean moving entirely to using the
> actual tracepoint infrastructure instead of using the seccomp hooks,
> or just looking up the proper filter by syscall nr? If there's a sane and
> better way to do the latter, I'm all ears :) As far as using the
> tracepoints themselves, I looked to how the perf/ftrace interactions
> worked and while I could've registered with the syscalls tracepoints
> for enter and exit, it would mean later evaluation of the system call
> interception, possibly out-of-order with respect to other registered
> event sinks, and there is complexity in just killing current from
> within the notifier-like list registered syscall events (as Eric Paris
> ran into when expanding filtering into perf itself). To get around
> that, the tracepoint handler would have to pump the data somewhere
> else (like it does for perf), and it just seemed messy. I think it's
> doable, but I don't know that the pure syscall tracepoint
> infrastructure should be burdened with the added requirements that
> come with seccomp-filtering. If I didn't properly understand the
> code, though, please set me on the right path.

No, my bad I was confused. I always post questions that show my
misunderstanding of a new (or not) patchset. It's like a tradition ;)