Re: [PATCH v7 01/41] Documentation/x86: Add CET shadow stack description

From: szabolcs.nagy@xxxxxxx
Date: Mon Mar 06 2023 - 11:39:51 EST


The 03/03/2023 22:35, Edgecombe, Rick P wrote:
> I think I slightly prefer the former arch_prctl() based solution for a
> few reasons:
> - When you need to find the start or end of the shadow stack you
> can just ask for it instead of searching. It can be faster and simpler.
> - It saves 8 bytes of memory per shadow stack.
>
> If this turns out to be wrong and we want to do the marker solution
> much later at some point, the safest option would probably be to create
> new flags.

i see two problems with a get bounds syscall:

- syscall overhead.

- discontinuous shadow stack (e.g. alt shadow stack ends with a
pointer to the interrupted thread shadow stack, so stack trace
can continue there, except you don't know the bounds of that).
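
to illustrate the discontinuous case, here is a rough sketch of a walk
that follows such a link and stops at an end marker, without needing
any bounds. the entry encoding is purely hypothetical (END_MARKER and
LINK_TAG are made-up names, not the kernel's or the hardware's token
format) and it runs over ordinary memory standing in for shadow stack
pages:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding, for illustration only. */
#define END_MARKER 0ull          /* assumed bottom-of-stack marker */
#define LINK_TAG   (1ull << 63)  /* assumed tag: entry points at another stack */

/*
 * Walk from ssp toward older entries (higher addresses), storing up to
 * max return addresses in out. A tagged entry redirects the walk to the
 * interrupted shadow stack; the end marker terminates it.
 */
static size_t collect_backtrace(const uint64_t *ssp, uint64_t *out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        uint64_t entry = *ssp;

        if (entry == END_MARKER)
            break;
        if (entry & LINK_TAG) {
            /* continue on the stack the link points to */
            ssp = (const uint64_t *)(uintptr_t)(entry & ~LINK_TAG);
            continue;
        }
        out[n++] = entry;
        ssp++;
    }
    return n;
}
```

with a get bounds syscall instead, the walker would need one query per
stack segment it crosses.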

> But just discussing this with HJ, can you share more on what the usage
> is? Like which backtracing operation specifically needs the marker? How
> much does it care about the ucontext case?

it could be an option for perf or ptracers to sample the stack trace.

in-process collection of stack trace for profiling or crash reporting
(e.g. when stack is corrupted) or cross checking stack integrity may
use it too.

sometimes parsing /proc/self/smaps may be enough, but the idea was to
enable light-weight backtrace collection in an async-signal-safe way.
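
for comparison, a minimal sketch of the smaps approach: find the
mapping that contains a given address (e.g. the current ssp). the
helper names are made up for this example, and it assumes the usual
"start-end perms ..." header line format. note that stdio is not
async-signal-safe, which is exactly the limitation mentioned above:

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Parse the "start-end" range from a mapping header line. Returns false
 * for the per-mapping detail lines in smaps ("Size:", "Anonymous:", ...).
 */
static bool parse_map_line(const char *line, unsigned long *start,
                           unsigned long *end)
{
    unsigned long s, e;

    if (sscanf(line, "%lx-%lx", &s, &e) != 2 || s >= e)
        return false;
    *start = s;
    *end = e;
    return true;
}

/*
 * Scan path (e.g. "/proc/self/smaps") for the mapping containing addr.
 * Not async-signal-safe: uses fopen/fgets. Lines longer than the buffer
 * are split by fgets, which this sketch does not handle specially.
 */
static bool find_mapping(const char *path, unsigned long addr,
                         unsigned long *start, unsigned long *end)
{
    char line[512];
    bool found = false;
    FILE *f = fopen(path, "r");

    if (!f)
        return false;
    while (!found && fgets(line, sizeof line, f)) {
        unsigned long s, e;

        if (parse_map_line(line, &s, &e) && s <= addr && addr < e) {
            *start = s;
            *end = e;
            found = true;
        }
    }
    fclose(f);
    return found;
}
```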

syscall overhead in case of frequent stack trace collection can be
avoided by caching (in tls) when ssp falls within the thread shadow
stack bounds. otherwise caching does not work as the shadow stack may
be reused (alt shadow stack or ucontext case).
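
a rough sketch of that caching scheme, under assumptions: the bounds
struct, the function names and query_bounds_slow() are hypothetical,
and the slow path is only a counting stub standing in for whatever get
bounds interface ends up existing. only the thread's own shadow stack
bounds are cached; anything else always takes the slow path:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct shstk_bounds {
    uintptr_t lo;   /* lowest address of the shadow stack */
    uintptr_t hi;   /* one past the highest address */
};

/* Per-thread cache of the thread's own shadow stack bounds. */
static _Thread_local struct shstk_bounds cached_bounds;

/* Counts slow-path queries, for illustration. */
static int slow_calls;

/* Hypothetical stand-in for a get bounds syscall; always fails here. */
static bool query_bounds_slow(uintptr_t ssp, struct shstk_bounds *out)
{
    (void)ssp;
    (void)out;
    slow_calls++;
    return false;
}

/* Record the thread shadow stack bounds once, e.g. at thread start. */
static void shstk_cache_set(uintptr_t lo, uintptr_t hi)
{
    assert(lo < hi);
    cached_bounds.lo = lo;
    cached_bounds.hi = hi;
}

/*
 * Fast path: ssp lies within the cached thread shadow stack.
 * Otherwise (alt shadow stack, ucontext) ask every time, since those
 * stacks may be reused and a cached answer could be stale.
 */
static bool shstk_get_bounds(uintptr_t ssp, struct shstk_bounds *out)
{
    if (cached_bounds.lo <= ssp && ssp < cached_bounds.hi) {
        *out = cached_bounds;
        return true;
    }
    return query_bounds_slow(ssp, out);
}
```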

unfortunately i don't know if syscall overhead is actually a problem
(probably not) or if backtrace across signal handlers needs to work
with alt shadow stack (i guess it should work for crash reporting).

thanks.