Re: [PATCH v2 19/20] x86/mm: Add speculative pagefault handling
From: Laurent Dufour
Date: Tue Aug 29 2017 - 10:58:43 EST
On 29/08/2017 16:50, Laurent Dufour wrote:
> On 21/08/2017 09:29, Anshuman Khandual wrote:
>> On 08/18/2017 03:35 AM, Laurent Dufour wrote:
>>> From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>>
>>> Try a speculative fault before acquiring mmap_sem, if it returns with
>>> VM_FAULT_RETRY continue with the mmap_sem acquisition and do the
>>> traditional fault.
>>>
>>> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>>>
>>> [Clearing of FAULT_FLAG_ALLOW_RETRY is now done in
>>> handle_speculative_fault()]
>>> [Retry with usual fault path in the case VM_ERROR is returned by
>>> handle_speculative_fault(). This allows signal to be delivered]
>>> Signed-off-by: Laurent Dufour <ldufour@xxxxxxxxxxxxxxxxxx>
>>> ---
>>> arch/x86/include/asm/pgtable_types.h | 7 +++++++
>>> arch/x86/mm/fault.c | 19 +++++++++++++++++++
>>> 2 files changed, 26 insertions(+)
>>>
>>> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
>>> index bf9638e1ee42..4fd2693a037e 100644
>>> --- a/arch/x86/include/asm/pgtable_types.h
>>> +++ b/arch/x86/include/asm/pgtable_types.h
>>> @@ -234,6 +234,13 @@ enum page_cache_mode {
>>> #define PGD_IDENT_ATTR 0x001 /* PRESENT (no other attributes) */
>>> #endif
>>>
>>> +/*
>>> + * Advertise that we call the Speculative Page Fault handler.
>>> + */
>>> +#ifdef CONFIG_X86_64
>>> +#define __HAVE_ARCH_CALL_SPF
>>> +#endif
>>> +
>>> #ifdef CONFIG_X86_32
>>> # include <asm/pgtable_32_types.h>
>>> #else
>>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
>>> index 2a1fa10c6a98..4c070b9a4362 100644
>>> --- a/arch/x86/mm/fault.c
>>> +++ b/arch/x86/mm/fault.c
>>> @@ -1365,6 +1365,24 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code,
>>> if (error_code & PF_INSTR)
>>> flags |= FAULT_FLAG_INSTRUCTION;
>>>
>>> +#ifdef __HAVE_ARCH_CALL_SPF
>>> + if (error_code & PF_USER) {
>>> + fault = handle_speculative_fault(mm, address, flags);
>>> +
>>> + /*
>>> + * We also check against VM_FAULT_ERROR because we have to
>>> + * raise a signal by calling later mm_fault_error() which
>>> + * requires the vma pointer to be set. So in that case,
>>> + * we fall through the normal path.
>>
>> Cant mm_fault_error() be called inside handle_speculative_fault() ?
>> Falling through the normal page fault path again just to raise a
>> signal seems overkill. Looking into mm_fault_error(), it seems they
>> are different for x86 and powerpc.
>>
>> X86:
>>
>> mm_fault_error(struct pt_regs *regs, unsigned long error_code,
>> unsigned long address, struct vm_area_struct *vma,
>> unsigned int fault)
>>
>> powerpc:
>>
>> mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
>>
>> Even in case of X86, I guess we would have reference to the faulting
>> VMA (after the SRCU search) which can be used to call this function
>> directly.
>
> Yes I think this is doable in the case of x86.
Indeed this is not doable as the vma pointer is not returned by
handle_speculative_fault() and this is not possible to return it because
once srcu_read_unlock() is called, the pointer is no more safe.