Re: [PATCH 3/8] seccomp: Introduce SECCOMP_PIN_ARCHITECTURE

From: Jann Horn
Date: Wed Jun 17 2020 - 11:25:41 EST


On Tue, Jun 16, 2020 at 9:49 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> For systems that provide multiple syscall maps based on architectures
> (e.g. AUDIT_ARCH_X86_64 and AUDIT_ARCH_I386 via CONFIG_COMPAT), allow
> a fast way to pin the process to a specific syscall mapping, instead of
> needing to generate all filters with an architecture check as the first
> filter action.

This seems reasonable; but can we maybe also add X86-specific handling
for that X32 mess? AFAIK there are four ways to do syscalls with
AUDIT_ARCH_X86_64:

1. normal x86-64 syscall, X32 bit unset (native case)
2. normal x86-64 syscall, X32 bit set (for X32 code calling syscalls
with no special X32 version)
3. x32-specific syscall, X32 bit unset (never happens legitimately)
4. x32-specific syscall, X32 bit set (for X32 code calling syscalls
with special X32 version)

(I got this wrong when I wrote the notes on x32 in the seccomp manpage...)

Can we add a flag for AUDIT_ARCH_X86_64 that says either "I want
native x64-64" (enforcing case 1) or "I want X32" (enforcing case 2 or
4, and in case 2 checking that the syscall has no X32 equivalent)? (Of
course, if the kernel is built without X32 support, we can leave out
these extra checks.)

> +static long seccomp_pin_architecture(void)
> +{
> +#ifdef CONFIG_COMPAT
> + u32 arch = syscall_get_arch(current);
> +
> + /* How did you even get here? */
> + if (current->seccomp.arch && current->seccomp.arch != arch)
> + return -EBUSY;
> +
> + current->seccomp.arch = arch;
> +#endif
> + return 0;
> +}

Are you intentionally writing this such that SECCOMP_PIN_ARCHITECTURE
only has an effect once you've installed a filter, and propagation to
other threads happens when a filter is installed with TSYNC? I guess
that is a possible way to design the API, but it seems like something
that should at least be pointed out explicitly.