On Fri, Jul 26, 2024 at 06:02:17PM +0200, David Hildenbrand wrote:
On 26.07.24 17:36, Peter Xu wrote:
On Thu, Jul 25, 2024 at 08:39:54PM +0200, David Hildenbrand wrote:
pte_lockptr() is the only *_lockptr() function that doesn't consume
what would be expected: it consumes a pmd_t pointer instead of a pte_t
pointer.
Let's change that. The two callers in pgtable-generic.c are easily
adjusted. Adjust khugepaged.c:retract_page_tables() to simply do a
pte_offset_map_nolock() to obtain the lock, even though we won't actually
be traversing the page table.
This makes the code more similar to the other variants and avoids other
hacks to make the new pte_lockptr() version happy. pte_lockptr() users
reside now only in pgtable-generic.c.
Maybe, using pte_offset_map_nolock() is the right thing to do because
the PTE table could have been removed in the meantime? At least it sounds
more future proof if we ever have other means of page table reclaim.
I think it can't change, because anyone who wants to race against this
should try to take the pmd lock first (which was held already)?
That doesn't explain why it is safe for us to assume that after we took the
PMD lock that the PMD actually still points at a completely empty page
table. Likely it currently works by accident, because we only have a single
such user that makes this assumption. It might certainly be a different once
we asynchronously reclaim page tables.
I think it's safe because find_pmd_or_thp_or_none() returned SUCCEED, and
we're holding i_mmap lock for read. I don't see any way that this pmd can
become a non-pgtable-page.
I meant, AFAIU tearing down pgtable in whatever sane way will need to at
least take both mmap write lock and i_mmap write lock (in this case, a file
mapping), no?
But yes, the PMD cannot get modified while we hold the PMD lock, otherwise
we'd be in trouble
I wonder an open coded "ptlock_ptr(page_ptdesc(pmd_page(*pmd)))" would be
nicer here, but only if my understanding is correct.
I really don't like open-coding that. Fortunately we were able to limit the
use of ptlock_ptr to a single user outside of arch/x86/xen/mmu_pv.c so far.
I'm fine if you prefer like that; I don't see it a huge deal to me.