Re: [PATCH v4 15/29] x86/mm/64: Enable vmapped stacks
From: Brian Gerst
Date: Mon Jun 27 2016 - 12:18:09 EST
On Mon, Jun 27, 2016 at 11:54 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Mon, Jun 27, 2016 at 8:22 AM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> On Mon, Jun 27, 2016 at 8:12 AM, Brian Gerst <brgerst@xxxxxxxxx> wrote:
>>> On Mon, Jun 27, 2016 at 11:01 AM, Brian Gerst <brgerst@xxxxxxxxx> wrote:
>>>> On Sun, Jun 26, 2016 at 5:55 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>>>> #ifdef CONFIG_X86_64
>>>>> /* Runs on IST stack */
>>>>> dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
>>>>> {
>>>>> static const char str[] = "double fault";
>>>>> struct task_struct *tsk = current;
>>>>> +#ifdef CONFIG_VMAP_STACK
>>>>> + unsigned long cr2;
>>>>> +#endif
>>>>>
>>>>> #ifdef CONFIG_X86_ESPFIX64
>>>>> extern unsigned char native_irq_return_iret[];
>>>>> @@ -332,6 +350,20 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
>>>>> tsk->thread.error_code = error_code;
>>>>> tsk->thread.trap_nr = X86_TRAP_DF;
>>>>>
>>>>> +#ifdef CONFIG_VMAP_STACK
>>>>> + /*
>>>>> + * If we overflow the stack into a guard page, the CPU will fail
>>>>> + * to deliver #PF and will send #DF instead. CR2 will contain
>>>>> + * the linear address of the second fault, which will be in the
>>>>> + * guard page below the bottom of the stack.
>>>>> + */
>>>>> + cr2 = read_cr2();
>>>>> + if ((unsigned long)tsk->stack - 1 - cr2 < PAGE_SIZE)
>>>>> + handle_stack_overflow(
>>>>> + "kernel stack overflow (double-fault)",
>>>>> + regs, cr2);
>>>>> +#endif
>>>>
>>>> Is there any other way to tell if this was from a page fault? If it
>>>> wasn't a page fault then CR2 is undefined.
>>>
>>> I guess it doesn't really matter, since the fault is fatal either way.
>>> The error message might be incorrect though.
>>>
>>
>> It's at least worth a comment, though. Maybe I should check if
>> regs->rsp is within 40 bytes of the bottom of the stack, too, such
>> that delivery of an inner fault would have double-faulted assuming the
>> inner fault didn't use an IST vector.
>>
>
> How about:
>
> /*
> * If we overflow the stack into a guard page, the CPU will fail
> * to deliver #PF and will send #DF instead. CR2 will contain
> * the linear address of the second fault, which will be in the
> * guard page below the bottom of the stack.
> *
> * We're limited to using heuristics here, since the CPU does
> * not tell us what type of fault failed and, if the first fault
> * wasn't a page fault, CR2 may contain stale garbage. To mostly
> * rule out garbage, we check if the saved RSP is close enough to
> * the bottom of the stack to cause exception delivery to fail.
> * The worst case is 7 stack slots: one for alignment, five for
> * SS..RIP, and one for the error code.
> */
> tsk_stack = (unsigned long)task_stack_page(tsk);
> if (regs->rsp <= tsk_stack + 7*8 && regs->rsp > tsk_stack - PAGE_SIZE) {
> /* A double-fault due to #PF delivery failure is plausible. */
> cr2 = read_cr2();
> if (tsk_stack - 1 - cr2 < PAGE_SIZE)
> handle_stack_overflow(
> "kernel stack overflow (double-fault)",
> regs, cr2);
> }
I think RSP anywhere in the guard page would be best, since it could
have been decremented by a function prologue into the guard page
before an access that triggers the page fault.
--
Brian Gerst