Re: [RFC PATCH v2 17/27] x86/cet/shstk: User-mode shadow stack support

From: Jann Horn
Date: Wed Jul 11 2018 - 17:10:53 EST


On Tue, Jul 10, 2018 at 3:31 PM Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx> wrote:
>
> This patch adds basic shadow stack enabling/disabling routines.
> A task's shadow stack is allocated from memory with VM_SHSTK
> flag set and read-only protection. The shadow stack is
> allocated to a fixed size.
>
> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
[...]
> diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
> new file mode 100644
> index 000000000000..96bf69db7da7
> --- /dev/null
> +++ b/arch/x86/kernel/cet.c
[...]
> +static unsigned long shstk_mmap(unsigned long addr, unsigned long len)
> +{
> +	struct mm_struct *mm = current->mm;
> +	unsigned long populate;
> +
> +	down_write(&mm->mmap_sem);
> +	addr = do_mmap(NULL, addr, len, PROT_READ,
> +		       MAP_ANONYMOUS | MAP_PRIVATE, VM_SHSTK,
> +		       0, &populate, NULL);
> +	up_write(&mm->mmap_sem);
> +
> +	if (populate)
> +		mm_populate(addr, populate);
> +
> +	return addr;
> +}

How does this interact with UFFDIO_REGISTER?

Is there an explicit design decision on whether FOLL_FORCE should be
able to write to shadow stacks? I'm guessing the answer is "yes,
FOLL_FORCE should be able to write to shadow stacks"? It might make
sense to add documentation for this.
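
To make the FOLL_FORCE question concrete, here is a minimal userspace
sketch of what FOLL_FORCE already permits on an ordinary PROT_READ
private mapping: writes through /proc/self/mem are performed with
FOLL_FORCE and normally succeed even though the page is read-only.
This is not from the patch; the function name
force_write_readonly_page() is made up for illustration, it assumes a
64-bit Linux system with /proc mounted, and it says nothing about how
VM_SHSTK mappings would or should behave. Some kernels or sandboxes
may refuse the write.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns  1 if a write through /proc/self/mem changed a PROT_READ page,
 *          0 if the write was refused or did not take effect,
 *         -1 if the setup (mmap or open) failed.
 */
static int force_write_readonly_page(void)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *p;
	unsigned char val = 0x5a;
	int fd, ret;

	if (page <= 0)
		return -1;

	/* Read-only, private, anonymous: the same protection the patch requests. */
	p = mmap(NULL, (size_t)page, PROT_READ,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	fd = open("/proc/self/mem", O_RDWR);
	if (fd < 0) {
		munmap((void *)p, (size_t)page);
		return -1;
	}

	/* The file offset in /proc/<pid>/mem is the target virtual address. */
	if (pwrite(fd, &val, 1, (off_t)(uintptr_t)p) == 1 && p[0] == val)
		ret = 1;
	else
		ret = 0;

	close(fd);
	munmap((void *)p, (size_t)page);
	return ret;
}
```

On a typical kernel I would expect this to return 1, which is why it
seems worth stating explicitly whether shadow stack pages are meant to
be reachable the same way.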

Should the kernel enforce a guard page between any two shadow stacks,
so that they cannot be directly adjacent? Otherwise, too much
recursion could run off the end of one shadow stack and corrupt an
adjacent one.
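
The adjacency concern can be written down as a simple invariant. A
minimal sketch, with hypothetical names (struct shstk_region,
shstk_has_guard_gap(), GUARD_GAP) and an assumed 4 KiB page size; this
is plain userspace C for illustration, not kernel code and not part of
the patch, and it assumes base + size does not wrap:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: the guard gap is one 4 KiB page. */
#define GUARD_GAP 4096UL

/* Hypothetical description of one shadow stack mapping. */
struct shstk_region {
	uintptr_t base;
	uintptr_t size;
};

/*
 * True if the two regions neither overlap nor touch, and at least
 * GUARD_GAP bytes of address space lie between them.
 */
static bool shstk_has_guard_gap(struct shstk_region a, struct shstk_region b)
{
	uintptr_t a_end = a.base + a.size;
	uintptr_t b_end = b.base + b.size;

	if (a_end <= b.base)
		return b.base - a_end >= GUARD_GAP;
	if (b_end <= a.base)
		return a.base - b_end >= GUARD_GAP;

	return false;	/* the regions overlap */
}
```

Two stacks at 0x10000 and 0x11000, each 0x1000 bytes long, fail the
check because they are directly adjacent; moving the second one to
0x12000 leaves a one-page hole and passes.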

> +int cet_setup_shstk(void)
> +{
> +	unsigned long addr, size;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> +		return -EOPNOTSUPP;
> +
> +	size = in_ia32_syscall() ? SHSTK_SIZE_32:SHSTK_SIZE_64;
> +	addr = shstk_mmap(0, size);
> +
> +	/*
> +	 * Return actual error from do_mmap().
> +	 */
> +	if (addr >= TASK_SIZE_MAX)
> +		return addr;
> +
> +	set_shstk_ptr(addr + size - sizeof(u64));
> +	current->thread.cet.shstk_base = addr;
> +	current->thread.cet.shstk_size = size;
> +	current->thread.cet.shstk_enabled = 1;
> +	return 0;
> +}
[...]
> +void cet_disable_free_shstk(struct task_struct *tsk)
> +{
> +	if (!cpu_feature_enabled(X86_FEATURE_SHSTK) ||
> +	    !tsk->thread.cet.shstk_enabled)
> +		return;
> +
> +	if (tsk == current)
> +		cet_disable_shstk();
> +
> +	/*
> +	 * Free only when tsk is current or shares mm
> +	 * with current but has its own shstk.
> +	 */
> +	if (tsk->mm && (tsk->mm == current->mm) &&
> +	    (tsk->thread.cet.shstk_base)) {
> +		vm_munmap(tsk->thread.cet.shstk_base,
> +			  tsk->thread.cet.shstk_size);
> +		tsk->thread.cet.shstk_base = 0;
> +		tsk->thread.cet.shstk_size = 0;
> +	}
> +
> +	tsk->thread.cet.shstk_enabled = 0;
> +}