Re: [PATCH 00/35] Shadow stacks for userspace

From: Andy Lutomirski
Date: Wed Feb 09 2022 - 21:38:17 EST


On 2/8/22 18:18, Edgecombe, Rick P wrote:
On Tue, 2022-02-08 at 20:02 +0300, Cyrill Gorcunov wrote:
On Tue, Feb 08, 2022 at 08:21:20AM -0800, Andy Lutomirski wrote:
But such a knob will immediately reduce the security value of
the entire
thing, and I don't have good ideas how to deal with it :(

Probably a kind of latch in the task_struct which would trigger
off once
returt to a different address happened, thus we would be able to
jump inside
paratite code. Of course such trigger should be available under
proper
capability only.

I'm not fully in touch with how parasite, etc works. Are we
talking about save or restore?

We use parasite code in question during checkpoint phase as far as I
remember.
push addr/lret trick is used to run "injected" code (code injection
itself is
done via ptrace) in compat mode at least. Dima, Andrei, I didn't look
into this code
for years already, do we still need to support compat mode at all?

If it's restore, what exactly does CRIU need to do? Is it just
that CRIU needs to return
out from its resume code into the to-be-resumed program without
tripping CET? Would it
be acceptable for CRIU to require that at least one shstk slot be
free at save time?
Or do we need a mechanism to atomically switch to a completely full
shadow stack at resume?

Off the top of my head, a sigreturn (or sigreturn-like mechanism)
that is intended for
use for altshadowstack could safely verify a token on the
altshdowstack, possibly
compare to something in ucontext (or not -- this isn't clearly
necessary) and switch
back to the previous stack. CRIU could use that too. Obviously
CRIU will need a way
to populate the relevant stacks, but WRUSS can be used for that,
and I think this
is a fundamental requirement for CRIU -- CRIU restore absolutely
needs a way to write
the saved shadow stack data into the shadow stack.

Still wrapping my head around the CRIU save and restore steps, but
another general approach might be to give ptrace the ability to
temporarily pause/resume/set CET enablement and SSP for a stopped
thread. Then injected code doesn't need to jump through any hoops or
possibly run into road blocks. I'm not sure how much this opens things
up if the thread has to be stopped...

Hmm, that's maybe not insane.

An alternative would be to add a bona fide ptrace call-a-function mechanism. I can think of two potentially usable variants:

1. Straight call. PTRACE_CALL_FUNCTION(addr) just emulates CALL addr, shadow stack push and all.

2. Signal-style. PTRACE_CALL_FUNCTION_SIGFRAME injects an actual signal frame just like a real signal is being delivered with the specified handler. There could be a variant to opt-in to also using a specified altstack and altshadowstack.

2 would be more expensive but would avoid the need for much in the way of asm magic. The injected code could be plain C (or Rust or Zig or whatever).

All of this only really handles save, not restore. I don't understand restore enough to fully understand the issue.

--Andy