Re: [PATCH mm-hotfixes] mm/hugetlb: avoid false positive lockdep assertion
From: David Hildenbrand (Arm)
Date: Wed May 13 2026 - 06:18:42 EST
On 5/13/26 10:56, Lorenzo Stoakes wrote:
> Commit 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not
> before") changed the locking model around hugetlbfs PMD unsharing on VMA
> split, but did not update the function which asserts the locks,
> hugetlb_vma_assert_locked().
>
> This function asserts that either the hugetlb VMA lock is held (if a shared
> mapping) or that the reservation map lock is held (if private).
>
> If you get an unfortunate race between something which results in one of
> these locks being released and a hugetlb split and you have CONFIG_LOCKDEP

"hugetlb split": I assume you used that terminology because of hugetlb_split(),
which is all just rather nasty #justhugetlbthings.

"hugetlb VMA split" is probably easier to get.

> enabled, you can therefore see a false positive assertion arise when there
> is in fact no issue.
>
> Since this change introduced a new take_locks parameter to
> hugetlb_unshare_pmds(), which, when set to false, indicates that locking is
> sufficient, simply pass this to the unsharing logic and predicate the
> lock assertions on this.
>
> This is safe, as we already asserted the file rmap lock and the VMA write
> lock prior to this (implying exclusive mmap write lock), so we cannot be
> raced by either rmap or page fault page table walkers which the asserted
> locks are intended to protect against (we don't mind GUP-fast).
>
> Separate out huge_pmd_unshare() into __huge_pmd_unshare() to add a
> check_locks parameter, and update hugetlb_unshare_pmds() to pass this
> parameter to it.
>
> This leaves all other callers of huge_pmd_unshare() still correctly
> asserting the locks.
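
For readers following along, the wrapper split described above can be modelled in userspace roughly as follows. This is only a sketch of the pattern, not the patch itself: every fake_* name is made up for illustration, the real kernel functions take more arguments, and the "lock" here is a plain flag rather than an rwsem or a lockdep-tracked lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a hugetlb VMA; purely illustrative. */
struct fake_vma {
	bool lock_held;	/* models the hugetlb VMA lock or resv map lock */
};

/* Counts would-be lockdep splats instead of warning. */
int nr_assert_failures;

/* Models hugetlb_vma_assert_locked(). */
void fake_vma_assert_locked(const struct fake_vma *vma)
{
	if (!vma->lock_held)
		nr_assert_failures++;
}

/* Models __huge_pmd_unshare(): assert only if the caller asks for it. */
int fake_pmd_unshare_inner(struct fake_vma *vma, bool check_locks)
{
	if (check_locks)
		fake_vma_assert_locked(vma);
	/* The real page table unsharing is omitted here. */
	return 1;
}

/* Models huge_pmd_unshare(): every other caller still asserts. */
int fake_pmd_unshare(struct fake_vma *vma)
{
	return fake_pmd_unshare_inner(vma, true);
}

/*
 * Models hugetlb_unshare_pmds(). With take_locks == false the caller
 * vouches for its own locking, so the assertion is skipped.
 */
void fake_unshare_pmds(struct fake_vma *vma, bool take_locks)
{
	if (take_locks)
		vma->lock_held = true;

	fake_pmd_unshare_inner(vma, take_locks);

	if (take_locks)
		vma->lock_held = false;
}
```

In this model, the VMA split path corresponds to fake_unshare_pmds(vma, false), which never reaches the assertion, while a direct fake_pmd_unshare() call without the lock is still caught.
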
>
> The below reproducer will trigger the assert in a kernel with
> CONFIG_LOCKDEP enabled by racing process teardown (which will release the
> hugetlb lock) against a hugetlb split.
>
> void execute_one(void)
> {
> 	void *ptr;
> 	pid_t pid;
>
> 	/*
> 	 * Create a hugetlb mapping spanning a PUD entry.
> 	 *
> 	 * We force the hugetlb page allocation with populate and
> 	 * noreserve.
> 	 *
> 	 *   |---------------------|
> 	 *   |                     |
> 	 *   |---------------------|
> 	 *   0                PUD boundary
> 	 */
> 	ptr = mmap(0, PUD_SIZE, PROT_READ | PROT_WRITE,
> 		   MAP_FIXED | MAP_SHARED | MAP_ANON |
> 		   MAP_NORESERVE | MAP_HUGETLB | MAP_POPULATE,
> 		   -1, 0);
> 	if (ptr == MAP_FAILED) {
> 		perror("mmap");
> 		exit(EXIT_FAILURE);
> 	}
>
> 	/*
> 	 * Fork but with a bogus stack pointer so we try to execute code in
> 	 * a non-VM_EXEC VMA, causing segfault + teardown via exit_mmap().
> 	 *
> 	 * The clone will cause PMD page table sharing between the
> 	 * processes first via:
> 	 * copy_process() -> ... -> huge_pte_alloc() -> huge_pmd_share()
> 	 *
> 	 * Then tear down and release the hugetlb 'VMA' lock via:
> 	 * exit_mmap() -> ... -> vma_close() -> hugetlb_vma_lock_free()
> 	 */
> 	pid = syscall(__NR_clone, 0, 2 * PMD_SIZE, 0, 0, 0);
> 	if (pid < 0) {
> 		perror("clone");
> 		exit(EXIT_FAILURE);
> 	}
> 	if (pid == 0) {
> 		/* Pop stack... */
> 		return;
> 	}
>
> 	/*
> 	 * We are the parent process.
> 	 *
> 	 * Race the child process's teardown with a PMD unshare.
> 	 *
> 	 * We do this by triggering:
> 	 *
> 	 * __split_vma() -> hugetlb_split() -> hugetlb_unshare_pmds()
> 	 *
> 	 * Which, importantly, doesn't hold the hugetlb VMA lock (nor can
> 	 * it), meaning we assert in hugetlb_vma_assert_locked().
> 	 *
> 	 *              .
> 	 *   |----------.----------|
> 	 *   |          .          |
> 	 *   |----------.----------|
> 	 *   0          .     PUD boundary
> 	 */
> 	mmap(0, PUD_SIZE / 2, PROT_READ | PROT_WRITE,
> 	     MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
> }
>
> int main(void)
> {
> 	int i;
>
> 	/* Kick off fork children. */
> 	for (i = 0; i < NUM_FORKS; i++) {
> 		pid_t pid = fork();
>
> 		if (pid < 0) {
> 			perror("fork");
> 			exit(EXIT_FAILURE);
> 		}
>
> 		/* Fork children do their work and exit. */
> 		if (!pid) {
> 			int j;
>
> 			for (j = 0; j < NUM_ITERS; j++)
> 				execute_one();
> 			return EXIT_SUCCESS;
> 		}
> 	}
>
> 	/* If we succeeded, wait on children. */
> 	for (i = 0; i < NUM_FORKS; i++)
> 		wait(NULL);
>
> 	return EXIT_SUCCESS;
> }
>
> Fixes: 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not before")
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Lorenzo Stoakes <ljs@xxxxxxxxxx>
> ---
LGTM, all rather nasty with "take_locks" parameters ...
Acked-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
--
Cheers,
David