Re: [PATCH v1 1/2] mm: let pte_lockptr() consume a pte_t pointer

From: David Hildenbrand
Date: Tue Jul 30 2024 - 12:14:02 EST


On 30.07.24 18:08, Marek Szyprowski wrote:
On 30.07.2024 17:49, David Hildenbrand wrote:
On 30.07.24 17:45, David Hildenbrand wrote:
On 30.07.24 17:30, Marek Szyprowski wrote:
On 25.07.2024 20:39, David Hildenbrand wrote:
pte_lockptr() is the only *_lockptr() function that doesn't consume
what would be expected: it consumes a pmd_t pointer instead of a pte_t
pointer.

Let's change that. The two callers in pgtable-generic.c are easily
adjusted. Adjust khugepaged.c:retract_page_tables() to simply do a
pte_offset_map_nolock() to obtain the lock, even though we won't
actually
be traversing the page table.

This makes the code more similar to the other variants and avoids
other
hacks to make the new pte_lockptr() version happy. pte_lockptr() users
reside now only in  pgtable-generic.c.

Maybe, using pte_offset_map_nolock() is the right thing to do because
the PTE table could have been removed in the meantime? At least it
sounds
more future proof if we ever have other means of page table reclaim.

It's not quite clear if holding the PTE table lock is really required:
what if someone else obtains the lock just after we unlock it? But
we'll
leave that as is for now, maybe there are good reasons.

This is a preparation for adapting hugetlb page table locking logic to
take the same locks as core-mm page table walkers would.

Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>

This patch landed in today's linux-next as commit e98970a1d2d4 ("mm:
let
pte_lockptr() consume a pte_t pointer"). Unfortunately it causes the
following issue on most of my ARM 32bit based test boards:


That is ... rather surprising.

The issue below seems to point at __pte_offset_map_lock(), where we
essentially convert from

ptlock_ptr(page_ptdesc(pmd_page(*pmd)));

to

ptlock_ptr(virt_to_ptdesc(pte));

I'm wondering, is highmem involved here such that the PTE would be
kmap'ed and virt_to_page() would not do what we would expect it to do?

Yes, highmem is enabled on those boards and all of them have 1GB+ of
RAM. For other kernel configuration options see
arch/arm/configs/exynos_defconfig.

Yes, pretty sure that's it. virt_to_page() won't work on kmaped pages.
So looks like we cannot easily do the conversion in this patch. :(

We'll have to get hacky in patch #2 instead.

@Andrew, can you drop both patches for now? I'll send a v2 that
essentially does in v2 on top something like:

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index da800e56fe590..c2e330b1eee21 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -963,7 +963,13 @@ static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
* will give us the same result: the per-MM PT lock.
*/
if (huge_page_size(h) < PMD_SIZE)
- return pte_lockptr(mm, pte);
+ /*
+ * pte_lockptr() needs the PMD, which we don't have. Because of
+ * highmem we cannot convert pte_lockptr() to consume a pte.
+ * But as we never have highmem page tables in hugetlb, we can
+ * safely use virt_to_ptdesc() here.
+ */
+ return ptlock_ptr(virt_to_ptdesc(pte));
else if (huge_page_size(h) < PUD_SIZE)
return pmd_lockptr(mm, (pmd_t *) pte);
return pud_lockptr(mm, (pud_t *) pte);


--
Cheers,

David / dhildenb