Re: [PATCH v2 6/9] mm: free user PTE page table pages

From: David Hildenbrand
Date: Wed Sep 01 2021 - 09:58:20 EST

On 01.09.21 15:53, Jason Gunthorpe wrote:
On Thu, Aug 19, 2021 at 11:18:55AM +0800, Qi Zheng wrote:

diff --git a/mm/gup.c b/mm/gup.c
index 2630ed1bb4f4..30757f3b176c 100644
+++ b/mm/gup.c
@@ -500,6 +500,9 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
if (unlikely(pmd_bad(*pmd)))
return no_page_table(vma, flags);
+ if (!pte_try_get(mm, pmd))
+ return no_page_table(vma, flags);
ptep = pte_offset_map_lock(mm, pmd, address, &ptl);

This is not good on a performance path, the pte_try_get() is
locking/locking the same lock that pte_offset_map_lock() is getting.

Yes, and we really need patch #8, anything else is just confusing reviewers.

This would be much better if the map_lock infra could manage the
refcount itself.

I'm also not really keen on adding ptl level locking to all the
currently no-lock paths. If we are doing that then the no-lock paths
should rely on the ptl for alot more of their operations and avoid the
complicatred no-lock data access we have. eg 'pte_try_get()' should
also copy the pte_t under the lock.

Also, I don't really understand how this scheme works with

With the RCU change it in #8 it should work just fine, because RCU synchronize has to wait either until all other CPUs have left the RCU read section, or re-enabled interrupts.


David / dhildenb