Re: [PATCH v2 28/39] x86/cet/shstk: Introduce map_shadow_stack syscall
From: Kees Cook
Date: Mon Oct 03 2022 - 18:27:45 EST
On Thu, Sep 29, 2022 at 03:29:25PM -0700, Rick Edgecombe wrote:
> [...]
> The following example demonstrates how to create a new shadow stack with
> map_shadow_stack:
> void *shstk = map_shadow_stack(adrr, stack_size, SHADOW_STACK_SET_TOKEN);
typo: addr
> [...]
> +451 common map_shadow_stack sys_map_shadow_stack
Isn't this "64", not "common"?
> [...]
> +#define SHADOW_STACK_SET_TOKEN 0x1 /* Set up a restore token in the shadow stack */
I think this should get an intro comment, like:
/* Flags for map_shadow_stack(2) */
Also, as with the other UAPI fields, please use "(1ULL << 0)" here.
> @@ -62,24 +63,34 @@ static int create_rstor_token(unsigned long ssp, unsigned long *token_addr)
> if (write_user_shstk_64((u64 __user *)addr, (u64)ssp))
> return -EFAULT;
>
> - *token_addr = addr;
> + if (token_addr)
> + *token_addr = addr;
>
> return 0;
> }
>
Can this just be collapsed into the patch that introduces create_rstor_token()?
> -static unsigned long alloc_shstk(unsigned long size)
> +static unsigned long alloc_shstk(unsigned long addr, unsigned long size,
> + unsigned long token_offset, bool set_res_tok)
> {
> int flags = MAP_ANONYMOUS | MAP_PRIVATE;
> struct mm_struct *mm = current->mm;
> - unsigned long addr, unused;
> + unsigned long mapped_addr, unused;
>
> mmap_write_lock(mm);
> - addr = do_mmap(NULL, addr, size, PROT_READ, flags,
Oops, I missed in the other patch that "addr" was being passed here.
(uninitialized?)
> - VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL);
> -
> + mapped_addr = do_mmap(NULL, addr, size, PROT_READ, flags,
> + VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL);
I don't see do_mmap() doing anything here to avoid remapping a prior vma
as shstk. Is the intention to allow userspace to convert existing VMAs?
This has caused pain in the past, perhaps force MAP_FIXED_NOREPLACE ?
> [...]
> @@ -174,6 +185,7 @@ int shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags,
>
>
> stack_size = PAGE_ALIGN(stack_size);
> + addr = alloc_shstk(0, stack_size, 0, false);
> if (IS_ERR_VALUE(addr))
> return PTR_ERR((void *)addr);
>
As mentioned earlier, I was expecting this patch to replace a (missing)
call to alloc_shstk. i.e. expecting:
- addr = alloc_shstk(stack_size);
> @@ -395,6 +407,26 @@ int shstk_disable(void)
> return 0;
> }
>
> +
> +SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
Please add kern-doc for this, with some notes. E.g. at least one thing isn't immediately
obvious, maybe more: "addr" must be a multiple of 8.
> +{
> + unsigned long aligned_size;
> +
> + if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> + return -ENOSYS;
This needs to explicitly reject unknown flags[1], or expanding them in the
future becomes very painful:
if (flags & ~(SHADOW_STACK_SET_TOKEN))
return -EINVAL;
[1] https://docs.kernel.org/process/adding-syscalls.html#designing-the-api-planning-for-extension
> +
> + /*
> + * An overflow would result in attempting to write the restore token
> + * to the wrong location. Not catastrophic, but just return the right
> + * error code and block it.
> + */
> + aligned_size = PAGE_ALIGN(size);
> + if (aligned_size < size)
> + return -EOVERFLOW;
The intention here is to allow userspace to ask for _less_ than a page
size multiple, and to put the restore token there?
Is it worth adding a check for size >= 8 here? Or, I guess it would just
immediately crash on the next call?
> +
> + return alloc_shstk(addr, aligned_size, size, flags & SHADOW_STACK_SET_TOKEN);
> +}
--
Kees Cook