Re: [PATCH 0/1] expand_downwards: don't require the gap if !vm_prev

From: Oleg Nesterov
Date: Thu Jun 29 2017 - 11:19:11 EST


On 06/28, Linus Torvalds wrote:
>
> On Wed, Jun 28, 2017 at 10:52 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > Now that the stack-guard-page has gone, why do we need to allow to grow
> > into the previous VM_GROWSDOWN vma? IOW, why we can not simply remove
> > the VM_GROWSDOWN check in expand_downwards() ?
>
> Because the "prev" vma may actually be the original vma.
>
> I think I described it in an earlier thread, but what happened at
> least once was:
>
> - program has some part that uses a lot of stack for part of the
> execution for some temp buffer or deep recursion or whatever
>
> - somebody noticed this, and decided to free up the no-longer-used
> pages by doing a "munmap()" after the program was done with that part
> of the stack
>
> - but the "munmap()" wasn't complete (maybe it only accounted for the
> explicitly used buffer, whatever), so the munmap actually didn't just
> remove the no-longer used bottom of the stack, it actually split the
> stack segment into two (with a small remaining stack turd that was the
> *real* bottom of the deep stack that used to exist)

Ah, OK, thanks...



> As to your patch: I would prefer to actually keep the new failure
> behavior of unconditionally breaking a big stack expansion), unless
> there's an actual thing it breaks.

Hmm. Maybe you misread this patch? Or I misunderstood.

> In fact, I'd even be quite open to adding a kernel warning about badly
> behaved binaries that grow their stack by a big amount in one go.

Yes, but this is another story.

Currently expand_downwards(address) does

	if (address < stack_guard_gap)
		return -ENOMEM;

This has nothing to do with "by how much it needs to grow"; it simply
forbids the bottom of the stack from being below stack_guard_gap. Why?

I don't think this patch can make any difference in practice, it just
tries to make this logic more consistent/understandable.

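For example, suppose that stack_guard_gap = 1M (the default). Now,

	void *addr = (void *)(512 << 10);	// any addr <= stack_guard_gap
	char *stack = mmap(addr, PAGE_SIZE, PROT_READ|PROT_WRITE,
			   MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED|MAP_GROWSDOWN, -1, 0);

	*stack = 0;		// inside the vma
	stack -= PAGE_SIZE;
	*stack = 0;		// needs expand_downwards()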

The first store will always succeed; the second one will always fail, even
if (as is likely) there is no other vma below. This looks strange to me.
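
To make the difference concrete, here is a rough userspace model of the two
checks. It is only a sketch and not the kernel code: check_current(),
check_proposed(), has_prev, prev_end and the *_MODEL constants are names made
up for illustration, 4K pages are assumed, and the VM_GROWSDOWN exception for
the previous vma is ignored.

```c
#include <errno.h>
#include <stdbool.h>

#define PAGE_SIZE_MODEL	4096UL				/* assumption: 4K pages */
#define GAP_MODEL	(256UL * PAGE_SIZE_MODEL)	/* 1M, the default gap */

/* Model of the current behaviour: any new stack bottom below the gap fails. */
int check_current(unsigned long address, unsigned long gap)
{
	if (address < gap)
		return -ENOMEM;
	return 0;
}

/*
 * Model of the idea in the subject line: require the gap only when a
 * previous vma exists. prev_end is the end address of that vma.
 */
int check_proposed(unsigned long address, unsigned long gap,
		   bool has_prev, unsigned long prev_end)
{
	if (!has_prev)
		return 0;	/* nothing below, so nothing to guard against */

	/* address < gap is tested first so that address - gap cannot wrap */
	if (address < gap || address - gap < prev_end)
		return -ENOMEM;
	return 0;
}
```

With the numbers from the example, the second store faults at 512K - 4K. In
this model check_current() rejects it, while check_proposed() accepts it as
long as has_prev is false.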

Oleg.