Re: [lkp-robot] [mm] 1be7107fbe: kernel_BUG_at_mm/mmap.c

From: Dmitry Safonov
Date: Thu Jun 22 2017 - 06:59:01 EST


On 06/22/2017 04:07 AM, Hugh Dickins wrote:
On Wed, 21 Jun 2017, Linus Torvalds wrote:
On Wed, Jun 21, 2017 at 1:56 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:

I understand. My point is that this check was invalidated by the stack guard
page long ago, which means that we are adding a user-visible change now.

Yeah. I guess we could consider it an *old* regression that got fixed,
but if people started relying on the regression...

Do you have a pointer to the report for this regression? I must have missed it.

See http://marc.info/?t=149794523000001&r=1&w=2

Ok.

And thinking about it, while that is a silly test-case, the notion of
"create top-down segment, then start populating it _before_ moving the
stack pointer into it" is actually perfectly valid.

So I guess checking against the stack pointer is wrong in that case -
at least if the stack pointer isn't inside that vma to begin with.
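
For context, the x86 flavour of the check under discussion can be sketched in userspace roughly as below. This is an illustration only, not the kernel code: the helper name fault_too_far_below_sp() is hypothetical, and the cushion (65536 bytes plus 32 words, to allow for instructions like enter and pusha) is how I recall arch/x86/mm/fault.c at the time of this thread. Other architectures use different margins.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper modeled on the x86 stack pointer check.
 * Returns true when a faulting address lies so far below the stack
 * pointer that the fault would be treated as a bad access instead
 * of a request to grow the stack.
 */
static bool fault_too_far_below_sp(uintptr_t address, uintptr_t sp)
{
    /* Assumed cushion: 64K plus 32 pointer-sized words. */
    const uintptr_t cushion = 65536 + 32 * sizeof(unsigned long);

    return address + cushion < sp;
}
```

As far as I can tell, the predicate only matters for faults below the current start of a grows-down vma. If the stack pointer still points into some other, higher stack, such a fault is rejected even though the program is legitimately populating its new segment top-down.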

So yes, removing that check looks like the right thing to do for now.

Do you want to send me the patch if you already have a commit message etc?

I have a bit of a bad feeling about this.

Perhaps it's just sentimental attachment to all those weird
and ancient stack pointer checks in arch/<some>/fault.c.

We have been inconsistent: cris frv m32r m68k microblaze mn10300
openrisc powerpc tile um x86 have such checks, the others don't.
So that's a good reason to delete them.

But at least at the moment those checks impose some sanity:
just a page less than we had imagined for several years.
Once we remove them, they cannot go back. Should we now
complicate them with an extra page of slop?

I'm not entirely persuaded by your pre-population argument:
it's perfectly possible to prepare a MAP_GROWSDOWN area with
an initial size, that's populated in a normal way, before handing
off for stack expansion - isn't it?
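
The alternative described here, mapping an initial size and populating it the normal way, might look like the sketch below. The helper name prepare_growsdown_area() and the 0xAA fill pattern are made up for illustration; only mmap(), MAP_GROWSDOWN and memset() are real interfaces.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes as a grows-down area and touch
 * every byte, so the area is populated by ordinary faults inside the
 * vma before it is handed off for use as a stack.
 * Returns NULL if the mapping could not be created.
 */
static void *prepare_growsdown_area(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* All of these writes land inside the mapping; nothing has to grow. */
    memset(p, 0xAA, len);
    return p;
}
```

Because every access stays within the initial mapping, no stack expansion happens during population, so the stack pointer check does not come into play until the area is actually used as a stack.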

I'd be interested to hear more about that (redhat internal) bug
report that Oleg mentions: whether it gives stronger grounds for
making this sudden change than the CRIU testcase.

Well, if the only problem is the CRIU testcase, it can easily be reworked.
The question is: will it break anything else?

Maybe it's better to disable this check for the release and re-enable it
for the v4.13 kernel, so that if it breaks some user-space, it'll be
caught in linux-next.


I can go ahead and create a patch if Oleg is not there at the
moment - but I might prefer his or your name on it - particularly
if we're rushing it in before consulting the arch maintainers
whose work we would be deleting.

Queasily,
Hugh


--
Dmitry