Re: [PATCH 3.10 268/268] mm: larger stack guard gap, between vmas

From: Willy Tarreau
Date: Wed Jun 21 2017 - 12:27:45 EST


Hey Hugh,

On Wed, Jun 21, 2017 at 09:18:23AM +0200, Willy Tarreau wrote:
> Thanks a lot, I'll include your patch and will test it again. And
> yes, I intend to merge Helge's fix once it lands into mainline (maybe
> it is right now, I didn't check) and possibly other ones you might be
> working on depending on various feedback.

So here's a quick update: I've built a kernel with my initial backport
fixed by applying your patch on top of it, and have run various tests
on it. All I can say is that it works for me. I've also instrumented
my test program a little bit more (attached).

With 3.10.106+ unpatched, I get this:

admin@formilux:~$ ulimit -s unlimited
admin@formilux:~$ /tmp/gap 65536 < /dev/null
mmap() failed
stack=0x550024b0 [0x550024b0-0x7fc6d0f0] (717663296 total bytes)
heap~=(nil) [(nil)-(nil)] (0 total bytes)
anon~=0x7ffef000 [0x2aaab000-0x7ffef000] (1431584768 total bytes)
heap...stack=2143736048 bytes
heap+anon+stack=2149248064 bytes
08048000-080d6000 r-xp 00000000 00:0d 3263 /var/tmp/gap
080d6000-080d8000 rw-p 0008d000 00:0d 3263 /var/tmp/gap
080d8000-080fb000 rw-p 00000000 00:00 0 [heap]
2aaab000-5537d000 rw-p 00000000 00:00 0 [stack:1813]
55382000-7fc7f000 rw-p 00000000 00:00 0
7fc7f000-7ffff000 rw-p 00000000 00:00 0
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
rounds: 10949

The output is not obvious: for each area (stack, heap, anon) it dumps
the last known pointer along with the lowest and highest ones seen,
then dumps the contents of /proc/self/maps, either after the segfault
or after a failed allocation. Here you can see that the last stack
pointer was 0x550024b0, which lies well inside the anon area
(2aaab000-5537d000). That area is reported as stack in /proc/self/maps,
probably due to the pointers crossing each other. The stack really was
the next VMA (55382000-7fc7f000). So we had a significant collision
here, with about 56 stack accesses being made in the anon VMA.

With 3.10.107-rc and your latest fix I get this:

admin@formilux:~$ ulimit -s unlimited
admin@formilux:~$ /tmp/gap 65536 </dev/null
SEGV caught
stack=0x552cb240 [0x552db250-0x7fa15960] (712222480 total bytes)
heap~=(nil) [(nil)-(nil)] (0 total bytes)
anon~=0x7fa27000 [0x2aaab000-0x7fa27000] (1425522688 total bytes)
heap...stack=2141280608 bytes
heap+anon+stack=2137745168 bytes
08048000-080d6000 r-xp 00000000 00:0d 3263 /var/tmp/gap
080d6000-080d8000 rw-p 0008d000 00:0d 3263 /var/tmp/gap
080d8000-080fb000 rw-p 00000000 00:00 0 [heap]
2aaab000-551cd000 rw-p 00000000 00:00 0
552db000-7fa27000 rw-p 00000000 00:00 0
7fa27000-7fa37000 rw-p 00000000 00:00 0
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]

There's no collision: the stack access stopped in the guard gap.

The program is ugly but usable. If you pass it a negative size, it will
first fill the heap in steps of that absolute size, then switch to anon
mappings once sbrk() fails. It's useless now but who knows.

So for me it's OK now. I'm attaching the test program. Greg, Ben, Sasha,
you need a 2 GB i386 machine to reliably test it (booting in a VM is OK).
I can provide you with a small system image offline if needed.

Cheers,
Willy
/* stack guard gap testing - 2017/06/17 - w.tarreau */

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <alloca.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static volatile void *stack_h, *stack_l, *stack_last;
static volatile void *heap_h, *heap_l, *heap_last;
static volatile void *anon_h, *anon_l, *anon_last;

char buf[256];
int len;

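/* print the recorded pointers for each area, then copy /proc/self/maps to stdout */
void dump(void)
{
	/* note: subtracting void pointers is a GNU C extension */
	printf("stack=%p [%p-%p] (%lu total bytes)\n", stack_last, stack_l, stack_h, (unsigned long)(stack_h - stack_l));
	printf("heap~=%p [%p-%p] (%lu total bytes)\n", heap_last, heap_l, heap_h, (unsigned long)(heap_h - heap_l));
	printf("anon~=%p [%p-%p] (%lu total bytes)\n", anon_last, anon_l, anon_h, (unsigned long)(anon_h - anon_l));
	printf("heap...stack=%lu bytes\n", (unsigned long)(stack_h - heap_l));
	printf("heap+anon+stack=%lu bytes\n",
	       (unsigned long)(stack_h - stack_l) +
	       (unsigned long)(heap_h - heap_l) +
	       (unsigned long)(anon_h - anon_l));
	fflush(stdout);	/* keep the printf() output ahead of the raw write() calls below */

	/* reopen fd 0 on /proc/self/maps: open() returns the lowest free descriptor */
	close(0);
	open("/proc/self/maps", O_RDONLY);
	while ((len = read(0, buf, sizeof(buf))) > 0)
		write(1, buf, len);
	close(0);
}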

/* SIGSEGV handler, run on the alternate stack: report, dump the state and exit */
void segv(int sig, siginfo_t *si, void *uc)
{
	printf("SEGV caught\n");
	dump();
	exit(1);
}

int main(int argc, char **argv)
{
	void *p;
	int round;
	long step = -65536;	/* > 0: mmap() steps; < 0: sbrk() steps first, then mmap() */
	stack_t ss;
	struct sigaction sa;

	if (argc > 1)
		step = atol(argv[1]);

	/* the handler needs its own stack: the main one is unusable when the SEGV hits */
	ss.ss_flags = 0;
	ss.ss_size = SIGSTKSZ;
	ss.ss_sp = malloc(ss.ss_size);
	sigaltstack(&ss, NULL);

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = segv;
	/* SA_SIGINFO is required for a three-argument sa_sigaction handler */
	sa.sa_flags = SA_SIGINFO | SA_ONESHOT | SA_ONSTACK;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGSEGV, &sa, NULL);

	stack_h = NULL;
	round = 0;
	while (1) {
		if (step > 0) {
			p = mmap(0, step, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
			if (p == MAP_FAILED)
				break;
			anon_last = p;

			if (!anon_l || p < anon_l)
				anon_l = p;

			if (!anon_h || p > anon_h)
				anon_h = p;
		}
		else {
			/* use sbrk(), not malloc(), to ensure libc doesn't use mmap() under us */
			p = sbrk(0);
			if (sbrk(-step) == (void *)-1) {
				/* heap is full: continue with mmap() */
				step = -step;
				continue;
				//break;
			}

			heap_last = p;

			if (!heap_l || p < heap_l)
				heap_l = p;

			if (!heap_h || p > heap_h)
				heap_h = p;
		}

		round++;
		/* wait for a key press; returns at once on EOF (e.g. stdin < /dev/null) */
		getchar();

		/* grow the stack by one step and touch only its lowest byte */
		p = alloca(labs(step));
		stack_last = p;
		*(char *)p = 0;

		//memset(p, 0, step);

		if (!stack_l || p < stack_l)
			stack_l = p;

		if (!stack_h || p > stack_h)
			stack_h = p;
	}
	printf("mmap() failed\n");
	dump();
	printf("rounds: %d\n", round);
	return 0;
}