Re: [PATCH 0/3] mm: __access_remote_vm with per-VMA lock

From: Suren Baghdasaryan

Date: Fri Jun 19 2026 - 10:21:00 EST


On Fri, Jun 19, 2026 at 6:05 AM David Hildenbrand (Arm)
<david@xxxxxxxxxx> wrote:
>
> >> All GUP does is walk page tables and call fault handlers. userfaultfd is nasty,
> >> but existing page faults must also deal with that, having to fall back to the MM
> >> lock, so it sounds like a solvable problem with some churn?
> >
> > Well I think a critical problem here, as pointed out by Suren, is that holding a
> > VMA lock means that the VMAs around you can change and in ways that are quite
> > problematic.
> >
> > E.g. The moment you drop the VMA lock that VMA might get freed and then merged
> > with something else, and the next VMA you consume is the same one you just
> > partially walked, for instance.
> >
> > Now perhaps you could reason your way around this, but I'm pretty sure there are
> > cases where you might actually miss VMAs due to races (Suren knows best).
> >
> > And also without an mmap lock people can unmap and map new VMAs in the range as
> > you go through which might cause weirdness as well.
> >
> > Really, unless you are dealing with a single VMA in the range, I suspect GUP
> > needs to stabilise that whole range.
>
> Well, depends, really. It's not like all GUP operations that target many pages
> run exclusively under the mmap lock, which would prevent any VMA changes.
>
> With userfaultfd, for example, we drop the lock in between, to look up the VMA
> again later. There are various paths where __get_user_pages_locked() is
> instructed to grab the mmap lock itself, to even temporarily drop it if the mmap
> lock was dropped.
>
> gup_fast_fallback() grabs some pages, then takes the mmap lock and continues
> from the next address.
>
>
> So it really depends on the use case. I would actually be surprised if there are
> a lot of use cases that strictly must block concurrent mremap operations etc.
>
> The important part is that you process each virtual page address requested
> exactly once. If the VMA was merged in the meantime, you continue from that
> address in the previously-processed VMA.
>
>
> Some use cases might indeed want to stabilize the whole range. But I wouldn't
> expect them to opt-in to using per-VMA locks.
>
> Just like with any other page table walker, we cannot just convert all in one
> shot to use per-VMA locks.
>
> >
> > If we could find a way to have GUP fast-path the single VMA case sensibly, then
> > that's probably workable?
>
> Right, that's what I said: start with a single-VMA interface that supports
> getting called with the per-vma lock or the mmap lock.
>
> If we have to fall back to the mmap lock (userfaultfd? indicated back by the
> caller), handle it in the caller of that interface for now.
>
> >
> > And I agree special-casing only one place but not others sucks.
>
> Yeah, we're not doing that unless inevitable.
>
> >
> > Perhaps we could find a way to get this improvement without it being quite so
> > 'tacked on' but without needing significant rework of GUP, but in either case I
> > broadly agree we need to improve the codebase as part of the changes.
>
> We shouldn't fear extending GUP in a reasonable way that makes everyone out
> there profit in the long run :)

I do not disagree with the general premise of making existing
mechanisms work better rather than implementing parallel ones. I'm
just pointing out my findings so far when I moved in that direction
and I'm happy Rik posted an alternative simple way around large
refactoring and started this discussion. We should definitely try
reworking GUP to cause less contention. I just don't have enough time
ATM to drive that, but would be happy to help with the VMA-locking
parts.

>
> --
> Cheers,
>
> David