Re: [PATCH v13 00/12] add support for Clang's Shadow Call Stack

From: Ard Biesheuvel
Date: Mon Apr 27 2020 - 13:39:42 EST


On Mon, 27 Apr 2020 at 18:00, Sami Tolvanen <samitolvanen@xxxxxxxxxx> wrote:
>
> This patch series adds support for Clang's Shadow Call Stack
> (SCS) mitigation, which uses a separately allocated shadow stack
> to protect against return address overwrites. More information
> can be found here:
>
> https://clang.llvm.org/docs/ShadowCallStack.html
>
> SCS provides better protection against traditional buffer
> overflows than CONFIG_STACKPROTECTOR_*, but it should be noted
> that SCS security guarantees in the kernel differ from the ones
> documented for user space. The kernel must store addresses of
> shadow stacks in memory, which means an attacker capable of
> reading and writing arbitrary memory may be able to locate them
> and hijack control flow by modifying the shadow stacks.
>
> SCS is currently supported only on arm64, where the compiler
> requires the x18 register to be reserved for holding the current
> task's shadow stack pointer.
>
> With -fsanitize=shadow-call-stack, the compiler injects
> instructions into all non-leaf C functions to store the return
> address on the shadow stack, and to unconditionally reload it
> from there before returning. As a result, SCS is incompatible
> with features that rely on modifying function return addresses
> on the kernel stack to alter control flow. A copy of the return
> address is still kept on the kernel stack, e.g. for
> compatibility with stack unwinding.
>
> SCS has a minimal performance overhead, but allocating
> shadow stacks increases kernel memory usage. The feature is
> therefore mostly useful on hardware that lacks support for PAC
> instructions.
>
> Changes in v13:
> - Changed thread_info::shadow_call_stack to a base address and
> an offset instead, and removed the now unneeded __scs_base()
> and scs_save().
> - Removed alignment from the kmem_cache and static allocations.
> - Removed the task_set_scs() helper function.
> - Moved the assembly code for loading and storing the offset in
> thread_info to scs_load/save macros.
> - Added offset checking to scs_corrupted().
> - Switched to cmpxchg_relaxed() in scs_check_usage().
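>
> (For reference, the cmpxchg_relaxed() change in scs_check_usage()
> boils down to lock-free tracking of a usage high-water mark,
> roughly along these lines. This is an illustrative sketch, not
> the exact code; scs_used() is a made-up helper here.)
>
>   static unsigned long scs_max_usage;
>
>   static void scs_check_usage(struct task_struct *tsk)
>   {
>           unsigned long used = scs_used(tsk); /* bytes in use */
>           unsigned long prev = READ_ONCE(scs_max_usage);
>
>           while (used > prev) {
>                   unsigned long old;
>
>                   /* statistics only, so no ordering is needed */
>                   old = cmpxchg_relaxed(&scs_max_usage, prev, used);
>                   if (old == prev)
>                           break;  /* we published the new maximum */
>                   prev = old;     /* lost a race; retry */
>           }
>   }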
>

OK, so one thing that came up in an offline discussion about SCS is
the way it interacts with the vmap'ed stack.

The vmap'ed stack is great for robustness, but it only works if things
don't explode for other reasons in the meantime. This means the
ordinary-to-shadow-call-stack size ratio should be chosen such that it
is *really* unlikely you could ever overflow the shadow call stack and
corrupt another task's call stack before hitting the vmap stack's
guard region.
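
To put rough numbers on that (assuming the 1 KB shadow stack this
series allocates per task and arm64's 16 KB THREAD_SIZE; correct
me if I have those wrong):

  1024 B / 8 B per return address    = 128 call frames
  128 frames * 128 B avg frame size  = 16 KB of ordinary stack

So with an average frame size above ~128 bytes, the ordinary stack
hits its guard page before the shadow call stack can fill up. A
deep call chain of small frames inverts that, and that is exactly
the case we need to size for.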

Alternatively, I wonder if there is a way we could let the SCS and
ordinary stack share [the bottom of] the vmap'ed region. That would
give rather nasty results if the ordinary stack overflows into the
SCS, but for cases where we really recurse out of control, we could
catch the overflow on either stack, whichever one it hits first. And
the nastiness, when it does occur, will not corrupt any state beyond
the stacks of the current task.
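
Something like the following layout, for example (only a sketch of
the idea; the split wouldn't have to be in the middle):

  +---------------------+ <- top of the vmap'ed region
  |    ordinary stack   |
  |          |          |    grows downward
  |          v          |
  |                     |
  |          ^          |    grows upward (x18 post-increments)
  |          |          |
  |  shadow call stack  |
  +---------------------+ <- base of the vmap'ed region
  |      guard page     |
  +---------------------+

On runaway recursion, the ordinary stack, which grows far faster
than the shadow stack's 8 bytes per frame, would normally be the
one to reach the other's live area, and whichever pointer crosses
over only clobbers memory that already belongs to the current task.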