Re: [PATCH v1 0/2] mm/hugetlb: fix hugetlb vs. core-mm PT locking

From: Andrew Morton
Date: Thu Jul 25 2024 - 16:41:46 EST


On Thu, 25 Jul 2024 20:39:53 +0200 David Hildenbrand <david@xxxxxxxxxx> wrote:

> Working on another generic page table walker that tries to avoid
> special-casing hugetlb, I found a page table locking issue with hugetlb
> folios that are not mapped using a single PMD/PUD.
>
> For some hugetlb folio sizes, GUP will take different page table locks
> when walking the page tables than hugetlb when modifying the page tables.
>
> I did not actually try reproducing an issue, but looking at
> follow_pmd_mask(), where we might be rereading a PMD value multiple
> times, it's rather clear that concurrent modifications are unpleasant.
>
> In follow_page_pte() we might fare better in that regard -- ptep_get()
> does a READ_ONCE() -- but who knows what else could happen concurrently
> in some weird corner cases (e.g., the hugetlb folio getting unmapped and
> freed).
>
> Did some basic sanity testing with various hugetlb sizes on x86-64 and
> arm64. Maybe I'll find some time to actually write a simple reproducer in
> the coming weeks, so this doesn't have to remain purely theoretical.
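
To make the locking mismatch described above concrete, here is a
stand-alone user-space sketch. It is not kernel code; the enum, the
sizes, and the helper names are illustrative assumptions, loosely
modeled on the idea that hugetlb derives its page table lock from the
hugetlb folio size (roughly what huge_pte_lockptr() keys on), while GUP
derives it from the page-table level it actually walks (roughly
pte_lockptr()/pmd_lockptr()):

#include <stdio.h>

#define PAGE_SIZE	(4UL << 10)
#define PMD_SIZE	(2UL << 20)

enum pt_lock { MM_PAGE_TABLE_LOCK, PMD_SPLIT_LOCK, PTE_SPLIT_LOCK };

/* Roughly: pick the lock from the hugetlb folio size. */
static enum pt_lock hugetlb_lock_for(unsigned long folio_size)
{
	if (folio_size == PMD_SIZE)
		return PMD_SPLIT_LOCK;
	return MM_PAGE_TABLE_LOCK;	/* incl. cont-PTE/cont-PMD sizes */
}

/* Roughly: pick the lock from the page-table level GUP walks. */
static enum pt_lock gup_lock_for(unsigned long level_size)
{
	if (level_size == PAGE_SIZE)
		return PTE_SPLIT_LOCK;
	if (level_size == PMD_SIZE)
		return PMD_SPLIT_LOCK;
	return MM_PAGE_TABLE_LOCK;
}

int main(void)
{
	unsigned long folio_size = 64UL << 10;	/* 64 KiB cont-PTE folio */
	unsigned long level_size = PAGE_SIZE;	/* mapped at the PTE level */

	printf("hugetlb lock %d vs. GUP lock %d: %s\n",
	       hugetlb_lock_for(folio_size), gup_lock_for(level_size),
	       hugetlb_lock_for(folio_size) == gup_lock_for(level_size) ?
	       "same lock" : "mismatch");
	return 0;
}

A folio mapped by a single PMD would make both sides agree on the
PMD-level lock; the cont-PTE/cont-PMD cases are where the two pick
different locks, matching the description above.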

When can we be confident that this change is merge-worthy?

> Only v6.10 is affected, so #1 can simply be backported as a prereq
> patch along with the real fix.

I'll add the same Fixes: tag to [1/2], and cc:stable.
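
For reference, those tags follow the usual kernel conventions; the
commit being fixed isn't named in this thread, so it stays a
placeholder here:

Fixes: <commit that introduced the issue> ("<its subject line>")
Cc: <stable@vger.kernel.org>	# v6.10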