Re: [criu] 1M guard page ruined restore

From: Cyrill Gorcunov
Date: Wed Jun 21 2017 - 12:28:48 EST


On Wed, Jun 21, 2017 at 05:57:30PM +0200, Oleg Nesterov wrote:
> (add Adrian)
>
> On 06/21, Cyrill Gorcunov wrote:
> >
> > The patches for criu are in flight. Still, one of the test cases
> > started failing with the new kernels. Basically the test does
> > the following:
>
> Cyrill, please read the last email I sent you in another (private) discussion.
> Most probably you should throw out some tests which assume the kernel has the
> stack-guard-page hack, it was replaced by the stack-guard-hole hack ;)

Yes, thank you.

> > - allocate growsdown memory area
> > - touch first byte (which, before the patch, forced the kernel
> > to extend the stack, allocating a new page)
> > - touch first-1 byte
> >
> > ---
> > int main(int argc, char **argv)
> > {
> > char *start_addr, *start_addr1, *fake_grow_down, *test_addr, *grow_down;
> > volatile char *p;
> >
> > start_addr = mmap(NULL, PAGE_SIZE * 10, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> > if (start_addr == MAP_FAILED) {
> > printf("Can't map a new region");
> > return 1;
> > }
> > printf("start_addr %lx\n", start_addr);
> > munmap(start_addr, PAGE_SIZE * 10);
> >
> > fake_grow_down = mmap(start_addr + PAGE_SIZE * 5, PAGE_SIZE,
> > PROT_READ | PROT_WRITE,
> > MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED | MAP_GROWSDOWN, -1, 0);
> > if (fake_grow_down == MAP_FAILED) {
> > printf("Can't map a new region");
> > return 1;
> > }
> > printf("start_addr %lx\n", fake_grow_down);
> >
> > p = fake_grow_down;
> > *p-- = 'c';
>
> I guess this works? I mean, *p-- = 'c' should not fail...

It fails.

>
> > *p = 'b';
>
> OK, now we need to expand the stack. This can fail or not. This depends on
> whether this vma (created by mmap(MAP_GROWSDOWN)) has a stack_guard_gap hole
> between it and its ->vm_prev.
>
> > function get dropped off. Hugh, it is done on intent and
> > userspace programs have to extend stack manually?
>
> No. a MAP_GROWSDOWN area should grow automatically. Unless the hole between
> vm_prev becomes less than stack_guard_gap.
>
> This is the whole point of the guard hole, or the guard page we had before.
> The previous implementation was just not accurate, which is why criu had to
> have some hacks to work around it.
>
> It no longer needs to know about guard hole/page/whatever. Just remove
> (conditionalize) all the MAP_GROWSDOWN code. Except, of course, you still
> need to record MAP_GROWSDOWN in vma_area->e->flags (_vmflag_match), in order
> to restore this vma correctly.

Oleg, look, it seems I've been testing on the wrong VM :) (Sigh, with so many
open at once it's easy to forget which one you're running in.)

Here is the complete code. It is supposed to _extend_ the stack, but it fails
on the latest master + Hugh's [PATCH] mm: fix new crash in unmapped_area_topdown()
---
[root@fc2 criu]# ~/st2
start_addr 7fe6162a8000
start_addr 7fe6163d9000
Segmentation fault (core dumped)
---
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#include <sys/mman.h>

#define PAGE_SIZE 4096

int main(void)
{
	char *start_addr, *fake_grow_down, *test_addr, *grow_down;
	volatile char *p;

	/* Find a free 512-page window, then release it for the MAP_FIXED calls. */
	start_addr = mmap(NULL, PAGE_SIZE * 512, PROT_READ | PROT_WRITE,
			  MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (start_addr == MAP_FAILED) {
		printf("Can't map a new region\n");
		return 1;
	}
	printf("start_addr %lx\n", (unsigned long)start_addr);
	munmap(start_addr, PAGE_SIZE * 512);

	start_addr += PAGE_SIZE * 300;

	fake_grow_down = mmap(start_addr + PAGE_SIZE * 5, PAGE_SIZE,
			      PROT_READ | PROT_WRITE,
			      MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED | MAP_GROWSDOWN, -1, 0);
	if (fake_grow_down == MAP_FAILED) {
		printf("Can't map a new region\n");
		return 1;
	}
	printf("start_addr %lx\n", (unsigned long)fake_grow_down);

	/* Touch the first byte, then the byte just below the area. */
	p = fake_grow_down;
	*p-- = 'c';
	*p = 'b';

	/* overlap the guard page of fake_grow_down */
	test_addr = mmap(start_addr + PAGE_SIZE * 3, PAGE_SIZE,
			 PROT_READ | PROT_WRITE,
			 MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
	if (test_addr == MAP_FAILED) {
		printf("Can't map a new region\n");
		return 1;
	}
	printf("test_addr %lx\n", (unsigned long)test_addr);

	grow_down = mmap(start_addr + PAGE_SIZE * 2, PAGE_SIZE,
			 PROT_READ | PROT_WRITE,
			 MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED | MAP_GROWSDOWN, -1, 0);
	if (grow_down == MAP_FAILED) {
		printf("Can't map a new region\n");
		return 1;
	}
	printf("grow_down %lx\n", (unsigned long)grow_down);

	munmap(test_addr, PAGE_SIZE);
	if (fake_grow_down[0] != 'c' || *(fake_grow_down - 1) != 'b') {
		printf("%c %c\n", fake_grow_down[0], *(fake_grow_down - 1));
		return 1;
	}

	p = grow_down;
	*p-- = 'z';
	*p = 'x';

	return 0;
}