Re: [PATCH v9 05/13] seccomp_filter: Document what seccomp_filter is and how it works.

From: Will Drewry
Date: Fri Jul 01 2011 - 11:46:32 EST


On Fri, Jul 1, 2011 at 8:07 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Will Drewry <wad@xxxxxxxxxxxx> wrote:
>
>> On Fri, Jul 1, 2011 at 6:56 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>> >
>> > * James Morris <jmorris@xxxxxxxxx> wrote:
>> >
>> >> On Wed, 29 Jun 2011, Will Drewry wrote:
>> >>
>> >> > Since it seems that there'll be consumers (openssh, vsftpd,
>> >> > kvm/qemu, chromium, chromium os) and feedback quieted down, what
>> >> > are the next steps to get this to a pull/no-pull decision points
>> >> > (or at least some Ack's or Nack's)?  I know this patch series
>> >> > crosses a number of maintainers, and I never know exactly what's
>> >> > next when the feedback slows down.
>> >>
>> >> Are there any outstanding objections to this approach?  How do the
>> >> tracing folk feel about it?
>> >
>> > I think i outlined my objections a couple of times and haven't seen
>> > them addressed.
>>
>> After our last discussion, I suggested changes which I then undertook
>> and reposted.  Those changes have been posted for over two weeks.
>
> Have you addressed my basic objection of why we should go for a more
> complex and less capable variant over a shared yet more capable
> facility:
>
>  http://lkml.kernel.org/r/20110526091518.GE26775@xxxxxxx
>
> ?

I withheld a fair number of comments because the other participants
addressed some of them, but I believe I've laid out my thoughts at
least twice. Perhaps not clearly, though. I can post links if you
like, but they were all direct responses to your posts.

> You are pushing the 'filter engine' approach currently, not the
> (much) more unified 'event filters' approach.

In short, it is a large amount of work that will provide an incomplete
solution. As is, this patch series has been in process for over 8
weeks. I've rewritten it about twice as many times as the 'v9' label
indicates. After doing that, I do not believe your proposed solution
is simpler or more reasonable in the near term (<= 2 years). Perhaps
the issue is just that someone who is more skilled and more
comfortable in the kernel could propose a better patch series - I
don't know.



From my view, ftrace events are not ready for the job yet - and
relying purely on available wrapped events may make it unsuitable for
attack surface reduction forever. As is, there is no compat syscall
support. Many syscalls are not wrapped at present and no one ack'd my
earlier patches around wrapping more. All of perf needs to be
overhauled to share per-task infrastructure. A new ABI needs to be
proposed if my prctl() changes are not acceptable to handle some of
the security-focused behavioral requirements. Performance
characteristics need to be better analyzed as the current perf
list_head approach may not scale as desired. The list goes on. My
proof of concept patch for "event filters" was just that - a proof of
concept. To truly share the filter events is a large amount of work
that may not be viable, and I believe you know that as well as I do.

That said, I did attempt to internalize all of your feedback, the
feedback from potential consumers, Linus' feedback, and that from
Steve and Frederic, along with the other people who were so kind as to
provide excellent feedback. The current patch series is forward
portable to an "event filters" model and nearly all of its per-task
management complexity could be folded into shared code if it is
possible to create such infrastructure (for perf, event filters, ?).
Alternatively, this interface could become the entry point until such
time as it outgrows a per-process only experience.

Based on the support from potential API consumers, I believe there is
interest in this patch series, and I worry that just like with the
last two attempts in the last two years, this series will be relegated
to the lwn archives in anticipation of a future solution that uses
infrastructure that isn't quite ready. I'm trying to approach a
problem that can be addressed today in a flexible, future-friendly
way, rather than open up a larger patch series with cross-kernel impact
that I'm unsure exactly how to integrate sanely and don't
know that I can commit to doing. And I don't see anyone else jumping
up to write the patch series to do it either :/



Anyway, that's where I am. I'd really like to be able to offer this
functionality for all Linux users of Chromium and I'd be thrilled to
see it helping enhance the privilege separation code in OpenSSH, among
the other projects that have come forward (qemu, etc).

If you still think this patch series is a "stupid, limited hack" even
after my attempts to balance the present with a possible future, then I'm
not sure what the practical next steps are. And even if it is such a
hack, it is isolated to a very specific purpose and does not impact
those who don't care about it. I don't think there's an attainable,
perfect solution, and when it comes to attack surface reduction, it's
hard to beat hooking the syscalls very explicitly.

Thanks again for the continued feedback,
will