Re: [PATCH RFC/RFT 3/3] kernel: converge common shadow stack flow agnostic to arch

From: Deepak Gupta
Date: Fri Oct 11 2024 - 13:06:11 EST


On Fri, Oct 11, 2024 at 01:33:05PM +0100, Mark Brown wrote:
On Thu, Oct 10, 2024 at 05:32:05PM -0700, Deepak Gupta wrote:

+unsigned long alloc_shstk(unsigned long addr, unsigned long size,
+ unsigned long token_offset, bool set_res_tok);
+int shstk_setup(void);
+int create_rstor_token(unsigned long ssp, unsigned long *token_addr);
+bool cpu_supports_shadow_stack(void);

The cpu_ naming is confusing in an arm64 context, we use cpu_ for
functions that report if a feature is supported on the current CPU and
system_ for functions that report if a feature is enabled on the system.


hmm...
Curious. What's the difference between cpu and system?
We can ditch both cpu and system and call it
`user_shstk_supported()`. Again not a great name but all we are looking for
is whether user shadow stack is supported or not.

+void set_thread_shstk_status(bool enable);

It might be better if this took the flags that the prctl() takes? It
feels like

hmm we can do that. But I imagine it might get invoked from other flow as well.
Although instead of `bool`, we can take `unsigned long` here. It would work for now
for `prctl` and future users get options to chisel around it.
I'll do that.


+/* Flags for map_shadow_stack(2) */
+#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */
+

We've also got SHADOW_STACK_SET_MARKER now.


Sorry, I missed it.

+bool cpu_supports_shadow_stack(void)
+{
+ return arch_cpu_supports_shadow_stack();
+}
+
+bool is_shstk_enabled(struct task_struct *task)
+{
+ return arch_is_shstk_enabled(task);
+}

Do we need these wrappers (or could they just be static inlines in the
header)?

As I started doing this exercise, I ran into some headers and multiple
definitions issue due to both modules (arch specific shstk.c and generic
usershstk.c) need to call into each other independently and need to include
each other headers, so took path of least resistence.

But now that it settling a bit and I've better picture, I'll give it a try
again.


+void set_thread_shstk_status(bool enable)
+{
+ arch_set_thread_shstk_status(enable);
+}

arm64 can return an error here, we reject a bunch of conditions like 32
bit threads and locked enable status.

Ok.
You would like this error to be `bool` or an `int`?


+unsigned long adjust_shstk_size(unsigned long size)
+{
+ if (size)
+ return PAGE_ALIGN(size);
+
+ return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G));
+}

static?

Yes this can be static.


+/*
+ * VM_SHADOW_STACK will have a guard page. This helps userspace protect
+ * itself from attacks. The reasoning is as follows:
+ *
+ * The shadow stack pointer(SSP) is moved by CALL, RET, and INCSSPQ. The
+ * INCSSP instruction can increment the shadow stack pointer. It is the
+ * shadow stack analog of an instruction like:
+ *
+ * addq $0x80, %rsp
+ *
+ * However, there is one important difference between an ADD on %rsp
+ * and INCSSP. In addition to modifying SSP, INCSSP also reads from the
+ * memory of the first and last elements that were "popped". It can be
+ * thought of as acting like this:
+ *
+ * READ_ONCE(ssp); // read+discard top element on stack
+ * ssp += nr_to_pop * 8; // move the shadow stack
+ * READ_ONCE(ssp-8); // read+discard last popped stack element
+ *
+ * The maximum distance INCSSP can move the SSP is 2040 bytes, before
+ * it would read the memory. Therefore a single page gap will be enough
+ * to prevent any operation from shifting the SSP to an adjacent stack,
+ * since it would have to land in the gap at least once, causing a
+ * fault.

This is all very x86 centric...

Yes I am aware and likely will be removed, assuming no objections from x86 on
removing it.


+ if (create_rstor_token(mapped_addr + token_offset, NULL)) {
+ vm_munmap(mapped_addr, size);
+ return -EINVAL;
+ }

Bikeshedding but can we call the function create_shstk_token() instead?
The rstor means absolutely nothing in an arm64 context.

`create_shstk_token` it is. Much better name. Thanks.


+SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
+{
+ bool set_tok = flags & SHADOW_STACK_SET_TOKEN;
+ unsigned long aligned_size;
+
+ if (!cpu_supports_shadow_stack())
+ return -EOPNOTSUPP;
+
+ if (flags & ~SHADOW_STACK_SET_TOKEN)
+ return -EINVAL;

This needs SHADOW_STACK_SET_MARKER for arm64.

Ack.


+ if (addr && (addr & (PAGE_SIZE - 1)))
+ return -EINVAL;

if (!PAGE_ALIGNED(addr))

+int shstk_setup(void)
+{

This is half of the implementation of the prctl() for enabling shadow
stacks. Looking at the arm64 implementation this rafactoring feels a
bit awkward, we don't have the one flag at a time requiremet that x86
has and we structure things rather differently. I'm not sure that the
arch_prctl() and prctl() are going to line up comfortably...


Yes I was in two minds as well. In x86 case, its anyways arch specific
so, you will land up in arch specific code and then later land in generic
code. In case of arm64 and riscv as well, each will have arch specific
implementation.

I'll give a try on making prctl handling arch agnostic. If it becomes too
kludgy and ugly, may be its best that prctl handling and first time shadow
stack setup is all arch specific.

+ struct thread_shstk *shstk = &current->thread.shstk;

Note for myself, this ^ is not needed. It's x86 specific and I missed this
clean up.

+ unsigned long addr, size;
+
+ /* Already enabled */
+ if (is_shstk_enabled(current))
+ return 0;
+
+ /* Also not supported for 32 bit */
+ if (!cpu_supports_shadow_stack() ||
+ (IS_ENABLED(CONFIG_X86_64) && in_ia32_syscall()))
+ return -EOPNOTSUPP;

We probably need a thread_supports_shstk(),

`is_shstk_enabled(current)` doesn't work?

arm64 has a similar check
for not 32 bit threads and I noted an issue with needing this check
elsewhere.

hmm May be that's why we need `is_shskt_enabled(current)` and another
`thread_supports_shstk` (probably some better name here from someone on list)

And in case we end up having no commonalities in prctl handling (as mentioned
in comment above), may be then its not needed to have a `thread_supports_shstk`
because its needed during prctl
handling.


+ /*
+ * For CLONE_VFORK the child will share the parents shadow stack.
+ * Make sure to clear the internal tracking of the thread shadow
+ * stack so the freeing logic run for child knows to leave it alone.
+ */
+ if (clone_flags & CLONE_VFORK) {
+ set_shstk_base_size(tsk, 0, 0);
+ return 0;
+ }

On arm64 we set the new thread's shadow stack pointer here, the logic
around that can probably also be usefully factored out.

Ok I'll take a look.


+ /*
+ * For !CLONE_VM the child will use a copy of the parents shadow
+ * stack.
+ */
+ if (!(clone_flags & CLONE_VM))
+ return 0;

Here also.

Same comment as above.