Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole

From: Pavel Machek
Date: Mon May 11 2009 - 08:15:46 EST


On Thu 2009-05-07 12:11:29, Ingo Molnar wrote:
>
> * Nicholas Miell <nmiell@xxxxxxxxxxx> wrote:
>
> > On Wed, 2009-05-06 at 15:21 -0700, Markus Gutschke wrote:
> > > On Wed, May 6, 2009 at 15:13, Ingo Molnar <mingo@xxxxxxx> wrote:
> > > > doing a (per arch) bitmap of harmless syscalls and replacing the
> > > > mode1_syscalls[] check with that in kernel/seccomp.c would be a
> > > > pretty reasonable extension. (.config controllable perhaps, for
> > > > old-style-seccomp)
> > > >
> > > > It would probably be faster than the current loop over
> > > > mode1_syscalls[] as well.
> > >
> > > This would be a great option to improve performance of our sandbox. I
> > > can detect the availability of the new kernel API dynamically, and
> > > then not intercept the bulk of the system calls. This would allow the
> > > sandbox to work both with existing and with newer kernels.
> > >
> > > We'll post a kernel patch for discussion in the next few days,
> > >
> >
> > I suspect the correct thing to do would be to leave seccomp mode 1
> > alone and introduce a mode 2 with a less restricted set of system
> > calls -- the interface was designed to be extended in this way,
> > after all.
>
> Yes, that is what i alluded to above via the '.config controllable'
> aspect.
>
> Mode 2 could be implemented like this: extend prctl_set_seccomp()
> with a bitmap pointer, and copy it to a per task seccomp context
> structure.
>
> a bitmap for 300 syscalls takes only about 40 bytes.
>
> Please take care to implement nesting properly: if a seccomp context
> does a seccomp call (which mode 2 could allow), then the resulting
> bitmap should be the logical-AND of the parent and child bitmaps.
> There's no reason why seccomp couldn't be used in a hierarchy of
> sandboxes, in a gradually less permissive fashion.
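
For reference, the bitmap scheme quoted above could look roughly like
the user-space sketch below. This is only an illustration under stated
assumptions: every name in it (sketch_seccomp_mask, sketch_mask_allow,
sketch_mask_allowed, sketch_mask_nest, SKETCH_NR_SYSCALLS) is
hypothetical and none of it is taken from kernel/seccomp.c or from any
posted patch. It shows one bit per syscall number, and nesting done by
AND-ing the requested mask into the task's current mask, so a nested
context can only lose permissions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* All names here are hypothetical; this is not kernel code. */
#define SKETCH_NR_SYSCALLS   300
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_MASK_WORDS \
	((SKETCH_NR_SYSCALLS + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD)

/* One bit per syscall number; a set bit means "allowed". */
struct sketch_seccomp_mask {
	unsigned long bits[SKETCH_MASK_WORDS];
};

/* Mark syscall nr as allowed; out-of-range numbers are ignored. */
static void sketch_mask_allow(struct sketch_seccomp_mask *m, unsigned int nr)
{
	if (nr < SKETCH_NR_SYSCALLS)
		m->bits[nr / SKETCH_BITS_PER_WORD] |=
			1UL << (nr % SKETCH_BITS_PER_WORD);
}

/* Single bit test, replacing a loop over a list of allowed syscalls. */
static bool sketch_mask_allowed(const struct sketch_seccomp_mask *m,
				unsigned int nr)
{
	if (nr >= SKETCH_NR_SYSCALLS)
		return false;
	return (m->bits[nr / SKETCH_BITS_PER_WORD] >>
		(nr % SKETCH_BITS_PER_WORD)) & 1UL;
}

/*
 * Nesting: the new mask is the logical AND of the current mask and the
 * requested one, so bits can be cleared but never set again.
 */
static void sketch_mask_nest(struct sketch_seccomp_mask *task,
			     const struct sketch_seccomp_mask *requested)
{
	for (size_t i = 0; i < SKETCH_MASK_WORDS; i++)
		task->bits[i] &= requested->bits[i];
}
```

300 bits need 38 bytes; rounded up to whole words that is 40 bytes with
either 4-byte or 8-byte longs, which matches the "about 40 bytes"
estimate.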

I don't think seccomp nesting (at kernel level) has any value.

First, syscalls are the wrong level of abstraction for sandboxing. There
are multiple ways to read from a file, for example.

If you wanted to do hierarchical sandboxes, asking your monitor to
restrict your seccomp mask would seem like a way to go...
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html