Re: [PATCH] x86/shstk: Free the thread shadow stack before disassociating from the mm

From: Mark Brown
Date: Wed Sep 11 2024 - 14:49:31 EST


On Wed, Sep 11, 2024 at 06:01:00PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2024-09-10 at 23:56 +0100, Mark Brown wrote:

> > When using shadow stacks the kernel will transparently allocate a shadow
> > stack for each thread. The intention is that this will be freed when the
> > thread exits but currently this doesn't actually happen. The deallocation
> > is done by shstk_free() which is called from exit_thread() and has a guard
> > to check for !tsk->mm due to the use of vm_unmap(). This doesn't actually
> > do anything since in do_exit() we call exit_mm() prior to thread_exit() and
> > exit_mm() disassociates the task from the mm and clears tsk->mm. The result
> > is that no shadow stacks will be freed until the process exits, leaking
> > memory for any process which creates and destroys threads.

...

> > It is entirely possible I am missing something here, I don't have a
> > system that allows me to test shadow stack support directly and have
> > only checked this by inspection and tested with my arm64 GCS series.
> > If this makes sense it'll need to become a dependency for GCS.

> The common cleanup case is via deactivate_mm()->shstk_free(), which happens when
> the MM is still attached. But there is also an exit_thread() caller in the fork
> failure patch (see copy_process()).

Ah, yes - glad I was missing something! I saw exit_thread() doing the
right thing in the error paths but not deactivate_mm().

> A quick search through the arm series and I don't see deactivate_mm()
> implementation, and instead a separate cleanup solution. Could that be the
> reason why you saw the leak on arm? Considering the trickiness of the auto
> allocated shadow stacks lifecycle, I think it would be great if all the
> implementations had common logic. If possible at least.

Yes, it's because we don't have a deactivate_mm() implementation and I
didn't add one (or lost it at some point), effectively this patch is
just adding deactivate_mm() by another name. The hook being a macro
definition in the header caught me out I think, I was probably just
using regular grep not git grep when I went looking.

I was actually considering what some factoring out might look like in
the context of the clone3() work, as incremental work on top of landing
the ABI so we try to avoid introducing yet more rounds of discussion to
delay that. map_shadow_stack() as well.

> BTW, two more notes on this whole area:
> 1. 99% sure glibc has some tests that catch leaks like hypothesized here, by
> watching for memory grown after repeated thread exits. IIRC I introduced a
> shadow stack leak at some point during development that failed the test.

Guess how this was noticed! It's tst-create-detached.c, with IIRC
tweaks to the environment (I think it was turning overcommit off was the
main thing) to make the test actually detect something itself.

> 2. Weijiang (CCed) is working on a fix for case in the opposite direction. An
> error path that attempts to free the shadow stack twice and triggers a warning.

Ack.

Attachment: signature.asc
Description: PGP signature