Re: [PATCH v27 24/31] x86/cet/shstk: Handle thread shadow stack

From: H.J. Lu
Date: Wed Jul 21 2021 - 16:15:26 EST


On Wed, Jul 21, 2021 at 11:15 AM John Allen <john.allen@xxxxxxx> wrote:
>
> On Fri, May 21, 2021 at 03:12:04PM -0700, Yu-cheng Yu wrote:
> > diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
> > index 5ea2b494e9f9..8e5f772181b9 100644
> > --- a/arch/x86/kernel/shstk.c
> > +++ b/arch/x86/kernel/shstk.c
> > @@ -71,6 +71,53 @@ int shstk_setup(void)
> > return 0;
> > }
> >
> > +int shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags,
> > + unsigned long stack_size)
> > +{
> > + struct thread_shstk *shstk = &tsk->thread.shstk;
> > + struct cet_user_state *state;
> > + unsigned long addr;
> > +
> > + if (!stack_size)
> > + return -EINVAL;
>
> I've been doing some light testing on AMD hardware and I've found that
> this version of the patchset doesn't boot for me. It appears that when
> systemd processes start spawning, they hit the above case, return
> -EINVAL, and the fork fails. In these cases, copy_thread has been passed
> 0 for both sp and stack_size.
>
> For previous versions of the patchset, I can still boot. When the
> stack_size check was last, the function would always return before
> completing the check, hitting one of the two cases below.
>
> At the very least, it would seem that on some systems, it isn't valid to
> rely on the stack_size passed from clone3, though I'm unsure what the
> correct behavior should be here. If the passed stack_size == 0 and sp ==
> 0, is this a case where we want to alloc a shadow stack for this thread
> with some capped size? Alternatively, is this a case that isn't valid to
> alloc a shadow stack and we should simply return 0 instead of -EINVAL?
>
> I'm running Fedora 34 which satisfies the required versions of gcc,
> binutils, and glibc.
>
> Please let me know if there is any additional information I can provide.

FWIW, I have been maintaining stable CET kernels at:

https://github.com/hjl-tools/linux/

The current CET kernel is on hjl/cet/linux-5.13.y branch.

> Thanks,
> John
>
> > +
> > + if (!shstk->size)
> > + return 0;
> > +
> > + /*
> > + * For CLONE_VM, except vfork, the child needs a separate shadow
> > + * stack.
> > + */
> > + if ((clone_flags & (CLONE_VFORK | CLONE_VM)) != CLONE_VM)
> > + return 0;
> > +
> > + state = get_xsave_addr(&tsk->thread.fpu.state.xsave, XFEATURE_CET_USER);
> > + if (!state)
> > + return -EINVAL;
> > +
> > + /*
> > + * Compat-mode pthreads share a limited address space.
> > + * If each function call takes an average of four slots
> > + * stack space, allocate 1/4 of stack size for shadow stack.
> > + */
> > + if (in_compat_syscall())
> > + stack_size /= 4;
> > +
> > + stack_size = round_up(stack_size, PAGE_SIZE);
> > + addr = alloc_shstk(stack_size);
> > + if (IS_ERR_VALUE(addr)) {
> > + shstk->base = 0;
> > + shstk->size = 0;
> > + return PTR_ERR((void *)addr);
> > + }
> > +
> > + fpu__prepare_write(&tsk->thread.fpu);
> > + state->user_ssp = (u64)(addr + stack_size);
> > + shstk->base = addr;
> > + shstk->size = stack_size;
> > + return 0;
> > +}
> > +
> > void shstk_free(struct task_struct *tsk)
> > {
> > struct thread_shstk *shstk = &tsk->thread.shstk;
> > @@ -80,7 +127,13 @@ void shstk_free(struct task_struct *tsk)
> > !shstk->base)
> > return;
> >
> > - if (!tsk->mm)
> > + /*
> > + * When fork() with CLONE_VM fails, the child (tsk) already has a
> > + * shadow stack allocated, and exit_thread() calls this function to
> > + * free it. In this case the parent (current) and the child share
> > + * the same mm struct.
> > + */
> > + if (!tsk->mm || tsk->mm != current->mm)
> > return;
> >
> > while (1) {



--
H.J.