Re: [PATCH 00/35] Shadow stacks for userspace

From: Edgecombe, Rick P
Date: Thu Feb 03 2022 - 20:08:36 EST


Hi Thomas,

Thanks for feedback on the plan.

On Thu, 2022-02-03 at 22:07 +0100, Thomas Gleixner wrote:
> > Until now, the enabling effort was trying to support both Shadow
> > Stack and IBT.
> > This history will focus on a few areas of the shadow stack
> > development history
> > that I thought stood out.
> >
> > Signals
> > -------
> > Originally signals placed the location of the shadow stack
> > restore
> > token inside the saved state on the stack. This was
> > problematic from a
> > past ABI promises perspective. So the restore location was
> > instead just
> > assumed from the shadow stack pointer. This works because in
> > normal
> > allowed cases of calling sigreturn, the shadow stack pointer
> > should be
> > right at the restore token at that time. There is no
> > alternate shadow
> > stack support. If an alt shadow stack is added later we
> > would
> > need to
>
> So how is that going to work? altstack is not an esoteric corner
> case.

My understanding is that the main usages for the signal stack were
handling stack overflows and corruption. Since the shadow stack only
contains return addresses rather than large stack allocations, and is
not generally writable or pivotable, I thought there was a good
possibility an alt shadow stack would not end up being especially
useful. Does it seem like reasonable guesswork?

If it does seems likely, I'll give it some more thought than that hand
wavy plan.

>
> > Limit to only Intel Processors
> > ------------------------------
> > Shadow stack is supported on some AMD processors, but this
> > revision
> > (with expanded HW usage and xsaves changes) has only has
> > been tested on
> > Intel ones. So this series has a patch to limit shadow stack
> > support to
> > Intel processors. Ideally the patch would not even make it
> > to mainline,
> > and should be dropped as soon as this testing is done. It's
> > included
> > just in case.
>
> Ha. I can give you access to an AMD machine with CET SS supported :)

Thanks for the offer. It sounds like John Allen can do this testing.

>
> > Future Work
> > ===========
> > Even though this is now exclusively a shadow stack series, there is
> > still some
> > remaining shadow stack work to be done.
> >
> > Ptrace
> > ------
> > Early in the series, there was a patch to allow IA32_U_CET
> > and
> > IA32_PL3_SSP to be set. This patch was dropped and planned
> > as a follow
> > up to basic support, and it remains the plan. It will be
> > needed for
> > in-progress gdb support.
>
> It's pretty much a prerequisite for enabling it, right?

Yes.

>
> > CRIU Support
> > ------------
> > In the past there was some speculation on the mailing list
> > about
> > whether CRIU would need to be taught about CET. It turns
> > out, it does.
> > The first issue hit is that CRIU calls sigreturn directly
> > from its
> > “parasite code” that it injects into the dumper process.
> > This violates
> > this shadow stack implementation’s protection that intends
> > to prevent
> > attackers from doing this.
> >
> > With so many packages already enabled with shadow stack,
> > there is
> > probably desire to make it work seamlessly. But in the
> > meantime if
> > distros want to support shadow stack and CRIU, users could
> > manually
> > disabled shadow stack via
> > “GLIBC_TUNABLES=glibc.cpu.x86_shstk=off” for
> > a process they will wants to dump. It’s not ideal.
> >
> > I’d like to hear what people think about having shadow stack
> > in the
> > kernel without this resolved. Nothing would change for any
> > users until
> > they enable shadow stack in the kernel and update to a glibc
> > configured
> > with CET. Should CRIU userspace be solved before kernel
> > support?
>
> Definitely yes. Making CRIU users add a glibc tunable is not really
> an
> option. We can't break CRIU systems with a kernel upgrade.

Ok got it, thanks. Just to be clear though, existing distros/binaries
out there will not have shadow stack enabled with just an updated
kernel (due to the enabling changes). So the CRIU tools would only
break after future glibc binaries enable CET, which users/distros would
have to do specifically (glibc doesn't even enable cet by default).

Since the purpose of this feature is to restrict previously allowed
behaviors, and it’s apparently getting enabled by default in some
distro's packages, I guess there is a decent chance that once a system
is updated with a future glibc some app somewhere will break. I was
under the impression that as long as there were no breakages under a
current set of binaries (including glibc), this was not considered a
kernel regression. Please correct me if this is wrong. I think there
are other options if we want to make this softer.

Of course none of that prevents known breakages from being fixed for
normal reasons and I’ll look into that for CRIU.