Re: [criu] 1M guard page ruined restore

From: Cyrill Gorcunov
Date: Tue Jun 20 2017 - 06:41:47 EST

On Tue, Jun 20, 2017 at 03:23:20AM -0700, Hugh Dickins wrote:
> Sorry for breaking you: we realized there was some risk of that.
> Would it be acceptable to you, to judge which kind of a kernel it is,
> by whether it has a global variable stack_guard_gap? I don't know
> if that would be a horrible hack, or the kind of thing that you're
> used to doing all over the place. Judging by kernel version will
> be awkward, since the patch is being backported to stable kernels.

Wait, maybe we could use VmFlags from /proc/$pid/smaps for that?
I mean, we show the "gd/gu" flag there if it's a stack area. Say we could
add an additional flag which would indicate that we should not remove
the guard page from the output. Currently we have in criu:

	/* Add a guard page only if here is enough space for it */
	if ((vma_area->e->flags & MAP_GROWSDOWN) &&
	    *prev_end < vma_area->e->start)
		vma_area->e->start -= PAGE_SIZE; /* Guard page */

So that on restore we can use mmap with MAP_FIXED. Hugh, I'm still
analyzing the problem in criu; maybe this code snippet is the only
problem and just picking up the smaps flags will be enough. Just
gimme some more time.
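
To make the VmFlags idea concrete, here is a minimal sketch of checking a
"VmFlags:" line from smaps for a given two-letter mnemonic such as "gd".
The helper name vmflags_has() is hypothetical and this is not criu's
actual parser; any additional flag for the new guard-gap behaviour is only
a proposal at this point, so no name for it is assumed here.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not criu code): return true if a "VmFlags:" line
 * read from /proc/<pid>/smaps contains the given mnemonic, e.g. "gd".
 * Flags are matched as whole whitespace-separated tokens.
 */
static bool vmflags_has(const char *line, const char *flag)
{
	static const char prefix[] = "VmFlags:";
	size_t flen = strlen(flag);
	const char *p;

	/* Only lines that start with the VmFlags key are considered. */
	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return false;

	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t len;

		/* Skip separators between flags. */
		p += strspn(p, " \t\n");
		len = strcspn(p, " \t\n");
		if (len == 0)
			break;
		if (len == flen && strncmp(p, flag, flen) == 0)
			return true;
		p += len;
	}
	return false;
}
```

The dump-side adjustment quoted above could then be gated on such a check
instead of on the kernel version, which matters because the change is
being backported to stable kernels.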

> But I'm surprised by your explanation above: maybe I'm confused,
> or maybe the explanation is different. Because as I see it, the
> change I made in that patch *maintained* consistency for CRIU:
> It used to be the case that there was a gap page included in the
> extent of the stack vma, but it didn't really belong in there,
> therefore show_map_vma() massaged the addresses shown to conceal it.
> Whereas now with the 1be7107fbe18 commit, the gap (page or more)
> is not included in the extent of the stack vma, so there's no
> longer any need to massage the addresses shown to conceal it.
> We do need to understand this fairly quickly, since those stable
> backports will pose more of a problem for you than the v4.12
> release itself.
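
One possible reading of the mismatch, as a simplified model rather than
kernel or criu code: all function names and the fixed 4096-byte page size
below are assumptions made for illustration. Before the change the gap
page was part of the stack vma and the shown start skipped over it; after
the change the shown start is the real vma start, so subtracting a page on
the criu side now lands outside the vma.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4K pages */

/* Old behaviour (simplified): the gap page is inside the vma but hidden. */
static uint64_t shown_start_old(uint64_t vm_start, bool is_stack)
{
	return is_stack ? vm_start + SKETCH_PAGE_SIZE : vm_start;
}

/* New behaviour: the gap is outside the vma, so nothing is hidden. */
static uint64_t shown_start_new(uint64_t vm_start, bool is_stack)
{
	(void)is_stack;
	return vm_start;
}

/* Model of the criu adjustment quoted earlier in this mail. */
static uint64_t criu_adjusted_start(uint64_t shown_start, uint64_t prev_end,
				    bool is_stack)
{
	/* Add a guard page only if there is room below the area. */
	if (is_stack && prev_end < shown_start)
		return shown_start - SKETCH_PAGE_SIZE;
	return shown_start;
}
```

In this model, feeding shown_start_old() into criu_adjusted_start() gives
back the real vma start, while feeding shown_start_new() into it gives an
address one page below the vma.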

Seems the patches are already in flight for most distros. So yes,
I'm trying my best right now.