Re: [PATCH v9 06/24] mm: make pte_unmap_same compatible with SPF
From: David Rientjes
Date: Tue Apr 03 2018 - 16:40:26 EST
On Tue, 3 Apr 2018, Jerome Glisse wrote:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 21b1212a0892..4bc7b0bdcb40 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2309,21 +2309,29 @@ static bool pte_map_lock(struct vm_fault *vmf)
> > * parts, do_swap_page must check under lock before unmapping the pte and
> > * proceeding (but do_wp_page is only called after already making such a check;
> > * and do_anonymous_page can safely check later on).
> > + *
> > + * pte_unmap_same() returns:
> > + * 0 if the PTE are the same
> > + * VM_FAULT_PTNOTSAME if the PTE are different
> > + * VM_FAULT_RETRY if the VMA has changed in our back during
> > + * a speculative page fault handling.
> > */
> > -static inline int pte_unmap_same(struct mm_struct *mm, pmd_t *pmd,
> > - pte_t *page_table, pte_t orig_pte)
> > +static inline int pte_unmap_same(struct vm_fault *vmf)
> > {
> > - int same = 1;
> > + int ret = 0;
> > +
> > #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
> > if (sizeof(pte_t) > sizeof(unsigned long)) {
> > - spinlock_t *ptl = pte_lockptr(mm, pmd);
> > - spin_lock(ptl);
> > - same = pte_same(*page_table, orig_pte);
> > - spin_unlock(ptl);
> > + if (pte_spinlock(vmf)) {
> > + if (!pte_same(*vmf->pte, vmf->orig_pte))
> > + ret = VM_FAULT_PTNOTSAME;
> > + spin_unlock(vmf->ptl);
> > + } else
> > + ret = VM_FAULT_RETRY;
> > }
> > #endif
> > - pte_unmap(page_table);
> > - return same;
> > + pte_unmap(vmf->pte);
> > + return ret;
> > }
> >
> > static inline void cow_user_page(struct page *dst, struct page *src, unsigned long va, struct vm_area_struct *vma)
> > @@ -2913,7 +2921,8 @@ int do_swap_page(struct vm_fault *vmf)
> > int exclusive = 0;
> > int ret = 0;
> >
> > - if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte))
> > + ret = pte_unmap_same(vmf);
> > + if (ret)
> > goto out;
> >
>
> This change what do_swap_page() returns ie before it was returning 0
> when locked pte lookup was different from orig_pte. After this patch
> it returns VM_FAULT_PTNOTSAME but this is a new return value for
> handle_mm_fault() (the do_swap_page() return value is what ultimately
> get return by handle_mm_fault())
>
> Do we really want that ? This might confuse some existing user of
> handle_mm_fault() and i am not sure of the value of that information
> to caller.
>
> Note i do understand that you want to return retry if anything did
> change from underneath and thus need to differentiate from when the
> pte value are not the same.
>
I think VM_FAULT_RETRY should be handled appropriately for any user of
handle_mm_fault() already, and would be surprised to learn differently.
Khugepaged has the appropriate handling. I think the concern is whether a
user is handling anything other than VM_FAULT_RETRY and VM_FAULT_ERROR
(which VM_FAULT_PTNOTSAME is not set in)? I haven't done a full audit of
the users.