Re: [PATCH v8 0/4] Introduce mseal

From: Theo de Raadt
Date: Fri Feb 02 2024 - 12:09:29 EST


Another interaction to consider is sigaltstack().

In OpenBSD, sigaltstack() forces MAP_STACK onto the specified
(pre-allocated) region, because on kernel-entry we require the "sp"
register to point to a MAP_STACK region (this severely damages ROP pivot
methods). Linux does not have MAP_STACK enforcement (yet), but one day
someone may try to do that work.

This interacted poorly with mimmutable() because some applications
allocate the memory they hand to sigaltstack() badly. I won't get into
the details
unless pushed, because what we found makes me upset. Over the years,
we've upstreamed diffs to applications to resolve all the nasty
allocation patterns. I think the software ecosystem is now mostly
clean.

I suggest someone in Linux look into whether sigaltstack() is an mseal()
bypass, perhaps somewhat similar to madvise(MADV_FREE), and consider the
correct strategy.

This is our documented strategy:

On OpenBSD some additional restrictions prevent dangerous address space
modifications. The proposed space at ss_sp is verified to be
contiguously mapped for read-write permissions (no execute) and incapable
of syscall entry (see msyscall(2)). If those conditions are met, a page-
aligned inner region will be freshly mapped (all zero) with MAP_STACK
(see mmap(2)), destroying the pre-existing data in the region. Once the
sigaltstack is disabled, the MAP_STACK attribute remains on the memory,
so it is best to deallocate the memory via a method that results in
munmap(2).

OK, I had better provide the details of what people were doing: placing
the sigaltstack in .data, in .bss, in malloc() memory, or in a buffer on
the stack. We even found one creating a sigaltstack inside a buffer on a
pthread stack. We told everyone to use mmap() and munmap(), adding
MAP_STACK to the mmap() flags if #ifdef MAP_STACK finds a definition.
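
For illustration, the mmap()/munmap() pattern recommended above could look
roughly like the sketch below. It is not taken from any of the diffs that
were upstreamed. The helper names altstack_install() and altstack_remove()
are hypothetical, and the code assumes the caller passes a size that is a
multiple of the page size and at least MINSIGSTKSZ.

```c
#define _DEFAULT_SOURCE         /* assumption: glibc; exposes MAP_ANON and MAP_STACK */
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map a dedicated region and register it as the
 * alternate signal stack. Returns 0 on success, -1 on failure.
 */
static int altstack_install(size_t size, stack_t *out)
{
    int flags = MAP_PRIVATE | MAP_ANON;
#ifdef MAP_STACK
    flags |= MAP_STACK;         /* only where the platform defines it */
#endif
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    stack_t ss = { .ss_sp = p, .ss_size = size, .ss_flags = 0 };
    if (sigaltstack(&ss, NULL) == -1) {
        munmap(p, size);        /* do not leak the mapping on failure */
        return -1;
    }
    *out = ss;
    return 0;
}

/*
 * Hypothetical helper: disable the alternate stack, then release the
 * region with munmap() so no stale stack mapping is left behind.
 */
static int altstack_remove(const stack_t *ss)
{
    stack_t off = { .ss_flags = SS_DISABLE };
    if (sigaltstack(&off, NULL) == -1)
        return -1;
    return munmap(ss->ss_sp, ss->ss_size);
}
```

A caller would invoke altstack_install(64 * 1024, &ss) before installing
handlers with SA_ONSTACK, and altstack_remove(&ss) once the handlers are
gone. Because the region comes from its own mmap() call, it never shares
pages with .data, .bss, the heap, or another stack.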