Re: [PATCH v4] mm/userfaultfd: detect VMA replacement after copy retry in mfill_copy_folio_retry()

From: David CARLIER

Date: Wed Apr 01 2026 - 04:06:56 EST


Hi Mike,

On Wed, Apr 01, 2026 at 08:49:00AM +0300, Mike Rapoport wrote:
> What does "folio allocated from the original VMA's backing store" exactly
> mean? Why is this a problem?

Fair point, the commit message was vague here. What I meant is:

mfill_atomic_pte_copy() captures ops = vma_uffd_ops(state->vma) and
passes it to __mfill_atomic_pte(). There, ops->alloc_folio() allocates
a folio for the original VMA's inode (e.g. a shmem folio for that
specific shmem inode). Then mfill_copy_folio_retry() drops all locks for
the copy_from_user retry. After mfill_get_vma() re-acquires them,
state->vma may now point to a replacement VMA, but ops is still the
stale pointer from before the drop.

The code then calls ops->filemap_add(folio, state->vma, ...), which
would insert a folio allocated for the old inode into the new VMA's
backing store. If the VMA changed type entirely (e.g. shmem -> anon),
the stale ops->filemap_add() would be operating on a VMA that has no
business receiving this folio.
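
To make the sequence concrete, here is a toy model in plain userspace C.
It is not kernel code: every model_* name is made up for illustration,
and model_vma_ops() only stands in for the idea of vma_uffd_ops(). It
shows nothing more than the pointer relationship: ops is captured once,
state->vma is swapped while the locks are dropped, and the captured ops
no longer belongs to the VMA we are about to use.

```c
#include <stdbool.h>

/* Toy stand-ins for illustration only, not the kernel's definitions. */
struct model_ops {
	const char *name;
};

struct model_vma {
	const struct model_ops *ops;
};

struct model_state {
	struct model_vma *vma;
};

/* Stand-in for the idea of vma_uffd_ops(): look up the ops of a VMA. */
static const struct model_ops *model_vma_ops(const struct model_vma *vma)
{
	return vma->ops;
}

/*
 * True when the ops captured before the lock drop no longer belong to
 * the VMA that state->vma points at after the locks are re-acquired.
 */
static bool model_ops_stale(const struct model_ops *captured,
			    const struct model_state *state)
{
	return captured != model_vma_ops(state->vma);
}
```

In the model, capturing ops, pointing state->vma at a different VMA and
then calling model_ops_stale() returns true, which is the situation the
retry path can end up in.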

> First, this is a pre-existing and TBH quite theoretical bug and it was there
> since the very beginning, so it should not be added as a fixup for the
> uffd+guestmemfd series.

You're right. The race window (VMA replacement during the lock-dropped
copy retry) existed in the original mcopy_atomic_pte() code long before
the vm_uffd_ops refactoring. The Fixes tag pointing at 56a3706fd7f9 was
wrong. I'll drop it and resend as a standalone fix against the original
retry logic.

> Second, I have reservations about vma_snapshot implementation. What
> invariant does it exactly enforce?

The invariant I was going for: "the folio we allocated is still
compatible with the VMA we're about to install it into." Since
alloc_folio() allocates from the VMA's backing file (inode), checking
that vm_file is still the same after re-acquiring locks ensures the
folio matches the inode. The vm_flags comparison was a secondary guard
against permission/type changes during the window.

That said, I can see the vma_snapshot abstraction is doing too much for
what's really needed. Would a simpler approach work better? I could just
take a reference on vm_file with get_file() before the drop, compare it
directly against state->vma->vm_file after re-acquiring the locks, and
fput() it afterwards. That makes the invariant explicit: "same backing
file means the folio is valid for this VMA."
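
Roughly what I have in mind, again as a userspace sketch rather than a
patch. The struct definitions are minimal stand-ins, model_get_file()
and model_fput() only imitate the reference counting of get_file() and
fput(), and pin_backing_file() / backing_file_unchanged() are
hypothetical helper names, not existing kernel functions.

```c
#include <stdbool.h>

/* Minimal userspace stand-ins; not the real kernel structures. */
struct file {
	int refcount;
};

struct vm_area_struct {
	struct file *vm_file;
};

/* Toy equivalents of get_file()/fput(): plain reference counting. */
static struct file *model_get_file(struct file *f)
{
	if (f)
		f->refcount++;
	return f;
}

static void model_fput(struct file *f)
{
	if (f)
		f->refcount--;
}

/* Hypothetical helper: pin the backing file before dropping the locks. */
static struct file *pin_backing_file(const struct vm_area_struct *vma)
{
	return model_get_file(vma->vm_file);
}

/*
 * Hypothetical helper: after re-acquiring the locks, check that the VMA
 * now in place still has the pinned file as its vm_file, then drop the
 * pin. Holding the reference across the window keeps the old file
 * alive, so its address cannot be reused by a different file and the
 * pointer comparison stays meaningful.
 */
static bool backing_file_unchanged(const struct vm_area_struct *vma,
				   struct file *pinned)
{
	bool same = (vma->vm_file == pinned);

	model_fput(pinned);
	return same;
}
```

The retry path would call pin_backing_file() before dropping the locks
and bail out if backing_file_unchanged() returns false afterwards. An
anonymous VMA would have a NULL vm_file on both sides in this model, so
whether that case needs an extra check is something I'd still have to
look at.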

Happy to rework along those lines, or if you have a different approach
in mind I'm open to suggestions.

> > I've fumbled the ball on your [2/2] unlikely() fix ;). Please resend that
> > after -rc1.
>
> This one should go the same route IMO.

Agreed, I'll resend both after -rc1.