Re: [PATCH v2 seccomp 1/6] seccomp: Move config option SECCOMP to arch/Kconfig

From: Jann Horn
Date: Wed Oct 28 2020 - 19:25:57 EST


On Wed, Oct 28, 2020 at 9:19 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> On Wed, Oct 28, 2020 at 1:06 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > On Tue, Oct 27, 2020 at 10:52:39AM +0100, Geert Uytterhoeven wrote:
> > > On Thu, Sep 24, 2020 at 2:48 PM YiFei Zhu <zhuyifei1999@xxxxxxxxx> wrote:
> > > > From: YiFei Zhu <yifeifz2@xxxxxxxxxxxx>
> > > >
> > > > In order to make adding configurable features into seccomp
> > > > easier, it's better to have the options at one single location,
> > > > considering especially that the bulk of seccomp code is
> > > > arch-independent. A quick look also shows that many SECCOMP
> > > > descriptions are outdated; they talk about /proc rather than
> > > > prctl.
> > > >
> > > > Because the moved config option keeps defaulting to on, SECCOMP
> > > > becomes enabled by default on architectures arm, arm64, csky,
> > > > riscv, sh, and xtensa, which did not have it on by default prior
> > > > to this change.
> > > >
> > > > Architectures microblaze, mips, powerpc, s390, sh, and sparc
> > > > have an outdated dependency on PROC_FS and this dependency is removed
> > > > in this change.
> > > >
> > > > Suggested-by: Jann Horn <jannh@xxxxxxxxxx>
> > > > Link: https://lore.kernel.org/lkml/CAG48ez1YWz9cnp08UZgeieYRhHdqh-ch7aNwc4JRBnGyrmgfMg@xxxxxxxxxxxxxx/
> > > > Signed-off-by: YiFei Zhu <yifeifz2@xxxxxxxxxxxx>
> > >
> > > Thanks for your patch, which is now commit 282a181b1a0d66de ("seccomp:
> > > Move config option SECCOMP to arch/Kconfig") in v5.10-rc1.
> > >
> > > > --- a/arch/Kconfig
> > > > +++ b/arch/Kconfig
> > > > @@ -458,6 +462,23 @@ config HAVE_ARCH_SECCOMP_FILTER
> > > >  	    results in the system call being skipped immediately.
> > > >  	  - seccomp syscall wired up
> > > >
> > > > +config SECCOMP
> > > > +	def_bool y
> > > > +	depends on HAVE_ARCH_SECCOMP
> > > > +	prompt "Enable seccomp to safely compute untrusted bytecode"
> > > > +	help
> > > > +	  This kernel feature is useful for number crunching applications
> > > > +	  that may need to compute untrusted bytecode during their
> > > > +	  execution. By using pipes or other transports made available to
> > > > +	  the process as file descriptors supporting the read/write
> > > > +	  syscalls, it's possible to isolate those applications in
> > > > +	  their own address space using seccomp. Once seccomp is
> > > > +	  enabled via prctl(PR_SET_SECCOMP), it cannot be disabled
> > > > +	  and the task is only allowed to execute a few safe syscalls
> > > > +	  defined by each seccomp mode.
> > > > +
> > > > +	  If unsure, say Y. Only embedded should say N here.
> > > > +
> > >
> > > Please tell me why SECCOMP is special, and deserves to default to be
> > > enabled. Is it really that critical, given only 13.5 (half of sparc ;-)
> > > out of 24 architectures implement support for it?
> >
> > That's an excellent point; I missed this in my review as I saw several
> > Kconfig already marked "def_bool y" but failed to note it wasn't _all_
> > of them. Okay, checking before this patch, these had them effectively
> > enabled:
> >
> > via Kconfig:
> >
> > parisc
> > s390
> > um
> > x86
>
> Mostly "server" and "desktop" platforms.
>
> > via defconfig, roughly speaking:
> >
> > arm
> > arm64
> > sh
>
> Note that these defconfigs are example configs, not meant for production.
> E.g. arm/multi_v7_defconfig and arm64/defconfig enable about everything
> for compile coverage.
>
> > How about making the default depend on HAVE_ARCH_SECCOMP_FILTER?
> >
> > These have SECCOMP_FILTER support:
> >
> > arch/arm/Kconfig: select HAVE_ARCH_SECCOMP_FILTER if AEABI && !OABI_COMPAT
> > arch/arm64/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/csky/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/mips/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/parisc/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/powerpc/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/riscv/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/s390/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/sh/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/um/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/x86/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> > arch/xtensa/Kconfig: select HAVE_ARCH_SECCOMP_FILTER
> >
> > So the "new" promotions would be:
> >
> > csky
> > mips
> > powerpc
> > riscv
> > xtensa
> >
> > Which would leave only these two:
> >
> > arch/microblaze/Kconfig: select HAVE_ARCH_SECCOMP
> > arch/sparc/Kconfig: select HAVE_ARCH_SECCOMP if SPARC64
> >
> > At this point, given the ubiquity of seccomp usage (e.g. systemd), I
> > guess it's not unreasonable to make it def_bool y?
>
> Having support does not necessarily imply you want it enabled.
> If systemd needs it (does it? I have Debian nfsroots with systemd,
> without SECCOMP), you can enable it in the defconfig.
> "Default y" is for things you cannot do without, unless you know
> better.
>
> Bloat-o-meter says enabling SECCOMP consumes only ca. 8 KiB
> (on arm32), so perhaps "default y if !EXPERT"?

Gating a *default* on EXPERT seems weird to me. Isn't EXPERT normally
used to gate whether things are configurable at all (using "if
EXPERT")?

I think that at least on systems with an MMU, SECCOMP should default to
y, independent of what EXPERT is set to. When SECCOMP is disabled,
various pieces of software will have to (potentially invisibly to the
user) degrade their belts-and-suspenders security measures. For
example, as far as I understand, systemd has support for using seccomp
to restrict what services can do (and uses that for some of its
built-in services), but skips those steps with a log message if you
don't have SECCOMP. Perhaps more importantly, the same thing happens
in OpenSSH's ssh_sandbox_child() function - it generates a debug
message, then continues on.
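
To illustrate the kind of silent fallback I mean, here is a minimal
userspace sketch. It is not the actual systemd or OpenSSH code; the
helper names classify_seccomp_probe(), seccomp_available() and
sandbox_or_continue() are made up for this example. It assumes the
behaviour documented in prctl(2), where PR_GET_SECCOMP fails with
EINVAL on a kernel built without CONFIG_SECCOMP:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: interpret the result of prctl(PR_GET_SECCOMP).
 * Returns 1 if seccomp is compiled in, 0 if it is not, and -1 if the
 * probe failed for some other reason.
 */
int classify_seccomp_probe(int ret, int err)
{
	if (ret >= 0)
		return 1;	/* 0 = no mode active for this thread, 2 = filter mode */
	if (err == EINVAL)
		return 0;	/* kernel built without CONFIG_SECCOMP */
	return -1;		/* unexpected failure */
}

/* Hypothetical helper: probe the running kernel. */
int seccomp_available(void)
{
	int ret = prctl(PR_GET_SECCOMP, 0, 0, 0, 0);

	return classify_seccomp_probe(ret, errno);
}

/*
 * Hypothetical helper: sandbox setup that degrades the way described
 * above. It logs a debug message and carries on unsandboxed.
 * Returns 1 if a sandbox could be set up, 0 if it was skipped.
 */
int sandbox_or_continue(void)
{
	if (seccomp_available() != 1) {
		fprintf(stderr, "debug: seccomp not available, continuing without sandbox\n");
		return 0;
	}
	/* A real program would install its seccomp filter here. */
	return 1;
}
```

A caller that ignores the return value of sandbox_or_continue() behaves
identically with and without SECCOMP from the user's point of view,
apart from one debug line that almost nobody will ever read.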

If someone does manage to find an OpenSSH pre-auth remote code
execution bug in a few years, I think we very much wouldn't want to be
in a situation where that can be used to compromise a bunch of routers
just because SECCOMP wasn't in the default config, or because it was
invisibly disabled when the router vendor enabled EXPERT so that they
can get rid of io_uring support.