Re: [RFC PATCH] x86/mm/fault: Allow stack access below %rsp

From: Waiman Long
Date: Mon Nov 05 2018 - 11:27:10 EST


On 11/02/2018 06:28 PM, Dave Hansen wrote:
> On 11/2/18 12:50 PM, Waiman Long wrote:
>> On 11/02/2018 03:44 PM, Dave Hansen wrote:
>>> On 11/2/18 12:40 PM, Waiman Long wrote:
>>>> The 64k+ limit check is kind of arbitrary. So the check is now removed
>>>> to just let expand_stack() decide if a segmentation fault should happen.
>>> With the 64k check removed, what's the next limit that we bump into? Is
>>> it just the stack_guard_gap space above the next-lowest VMA?
>> I think it is both the stack_guard_gap space above the next lowest VMA
>> and the rlimit(RLIMIT_STACK).
> The gap seems to be hundreds of megabytes, typically where RLIMIT_STACK
> is 8MB by default, so RLIMIT_STACK is likely to be the practical limit
> that will be hit. So, practically, we've taken a ~64k area that we
> would on-demand extend the stack into in one go, and turned that into
> the full ~8MB area that you could have expanded into anyway, but all at
> once.
>
> That doesn't seem too insane, especially since we don't physically back
> the 8MB or anything. Logically, it also seems like you *should* be able
> to touch any bit of the stack within the rlimit.
>
> But, on the other hand, as our comments say: "Accessing the stack below
> %sp is always a bug." Have we been unsuccessful in convincing our gcc
> buddies of this?

With gcc 4.4.7, the object code for the sample program in the commit log
is:

0x00000000004004c4 <+0>: push %rbp
0x00000000004004c5 <+1>: mov %rsp,%rbp
0x00000000004004c8 <+4>: push %rbx
0x00000000004004c9 <+5>: sub $0x18,%rsp
0x00000000004004cd <+9>: lea -0x2ff8(%rsp),%rax
0x00000000004004d5 <+17>: movq $0x0,(%rax)
0x00000000004004dc <+24>: mov %rsp,%rax
0x00000000004004df <+27>: mov %rax,%rbx
0x00000000004004e2 <+30>: lea -0x3ff8(%rsp),%rax
0x00000000004004ea <+38>: lea -0x43008(%rsp),%rdx
0x00000000004004f2 <+46>: jmp 0x400501 <main+61>
0x00000000004004f4 <+48>: movq $0x0,(%rax)
0x00000000004004fb <+55>: sub $0x1000,%rax
0x0000000000400501 <+61>: cmp %rdx,%rax
0x0000000000400504 <+64>: ja 0x4004f4 <main+48>
0x0000000000400506 <+66>: movq $0x0,(%rdx)
0x000000000040050d <+73>: sub $0x40010,%rsp
0x0000000000400514 <+80>: mov %rsp,%rax
0x0000000000400517 <+83>: add $0xf,%rax
0x000000000040051b <+87>: shr $0x4,%rax
0x000000000040051f <+91>: shl $0x4,%rax
0x0000000000400523 <+95>: mov %rax,-0x18(%rbp)
0x0000000000400527 <+99>: mov $0x400638,%edi
0x000000000040052c <+104>: callq 0x4003b8 <puts@plt>
0x0000000000400531 <+109>: mov $0x0,%eax
0x0000000000400536 <+114>: mov %rbx,%rsp
0x0000000000400539 <+117>: mov -0x8(%rbp),%rbx
0x000000000040053d <+121>: leaveq
0x000000000040053e <+122>: retq

With a newer gcc 4.8.5, the object code becomes:

0x000000000040052d <+0>: push %rbp
0x000000000040052e <+1>: mov %rsp,%rbp
0x0000000000400531 <+4>: lea -0x1020(%rsp),%rsp
0x0000000000400539 <+12>: mov $0xfffffffffffc0000,%r11
0x0000000000400540 <+19>: lea (%rsp,%r11,1),%r11
0x0000000000400544 <+23>: cmp %r11,%rsp
0x0000000000400547 <+26>: je 0x400557 <main+42>
0x0000000000400549 <+28>: sub $0x1000,%rsp
0x0000000000400550 <+35>: orq $0x0,(%rsp)
0x0000000000400555 <+40>: jmp 0x400544 <main+23>
0x0000000000400557 <+42>: lea 0x1020(%rsp),%rsp
0x000000000040055f <+50>: mov $0x400600,%edi
0x0000000000400564 <+55>: callq 0x400410 <puts@plt>
0x0000000000400569 <+60>: mov $0x0,%eax
0x000000000040056e <+65>: leaveq
0x000000000040056f <+66>: retq

So gcc has changed to avoid accessing the stack below %rsp, but my main
concern is old binaries that were compiled with an old gcc.

Cheers,
Longman