Re: [PATCH v2 0/5] mm: reduce mmap_lock contention and improve page fault performance

From: Barry Song

Date: Sun Jun 21 2026 - 20:16:18 EST

On Mon, Jun 22, 2026 at 4:49 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Sat, Jun 20, 2026 at 04:48:57PM -0700, Suren Baghdasaryan wrote:
> > Just checking in on the followup plans. IIUC the RFC mentioned will
> > try to implement the solution we discussed at LSFMM: splitting
> > VM_FAULT_RETRY into two flags - one for retrying under per-VMA locks
> > and another one to fallback to mmap_lock.
>
> I continue to hate this idea. I don't believe that those who were
> pushing for it have ever tried to understand the whole fault path.
> It's utterly byzantine.
>
> I defy anyone to make sense of this:
>
> 	/*
> 	 * NOTE! This will make us return with VM_FAULT_RETRY, but with
> 	 * the fault lock still held. That's how FAULT_FLAG_RETRY_NOWAIT
> 	 * is supposed to work. We have way too many special cases..
> 	 */
> 	if (vmf->flags & FAULT_FLAG_RETRY_NOWAIT)
> 		return 0;
>
> 	*fpin = maybe_unlock_mmap_for_io(vmf, *fpin);
> 	if (vmf->flags & FAULT_FLAG_KILLABLE) {
> 		if (__folio_lock_killable(folio)) {
> 			/*
> 			 * We didn't have the right flags to drop the
> 			 * fault lock, but all fault_handlers only check
> 			 * for fatal signals if we return VM_FAULT_RETRY,
> 			 * so we need to drop the fault lock here and
> 			 * return 0 if we don't have a fpin.
> 			 */
> 			if (*fpin == NULL)
> 				release_fault_lock(vmf);
> 			return 0;
> 		}
>
> We need to simplify the fault path, not add additional complexity.
> Josef has said he wouldn't've done the lock dropping had we had per-VMA
> locks. We should rip it out.

I think you have agreed that, at least for anon VMAs, we can
keep the current policy, since anon VMAs are much more volatile
than file VMAs: page faults racing with VMA modifications
happen more often on anon VMAs than on file VMAs.

For file VMAs, how much code can we actually remove, given that
the first page fault might already be holding mmap_lock?
If lock_vma_under_rcu() fails, the very first page fault falls
back to mmap_lock, before any retry has happened. So are we
also going to rip out the lock release, even if it risks
holding mmap_lock for a long time?

	vma = lock_vma_under_rcu(mm, addr);
	if (!vma)
		goto lock_mmap;
	...
lock_mmap:

	vma = lock_mm_and_find_vma(mm, addr, regs);
	if (unlikely(!vma)) {
		fault = 0;
		si_code = SEGV_MAPERR;
		goto bad_area;
	}

If we still need to keep the page fault retry code there, it
doesn't seem like "ripping out" really reduces complexity in
the page fault code?
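
To make the concern concrete, here is a rough userspace model of the
two policies. It is only a sketch, not kernel code: struct fault_model,
model_handle_fault() and model_page_fault() are hypothetical names made
up for illustration, and the I/O wait is reduced to flipping a flag.
The assumption is that a fault which cannot take the per-VMA lock runs
under mmap_lock from the start.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: none of these names exist in the kernel. */
enum fault_lock { LOCK_NONE, LOCK_VMA, LOCK_MMAP };

struct fault_model {
	bool vma_lock_available;  /* would the per-VMA lock attempt succeed? */
	bool folio_locked_by_io;  /* does the handler have to wait for I/O? */
	bool drop_lock_for_io;    /* policy: release the fault lock before waiting */
	int retries;              /* how many times the fault was retried */
	int mmap_lock_held_io;    /* I/O waits done while holding mmap_lock */
};

/* Returns true when the caller has to retry the fault. */
static bool model_handle_fault(struct fault_model *m, enum fault_lock held)
{
	assert(held != LOCK_NONE);

	if (!m->folio_locked_by_io)
		return false;	/* folio is ready, fault completes */

	if (m->drop_lock_for_io) {
		/* Current policy: wait without the lock, then ask for a retry. */
		m->folio_locked_by_io = false;
		return true;
	}

	/* "Ripped out" policy: wait with whatever lock we hold. */
	if (held == LOCK_MMAP)
		m->mmap_lock_held_io++;
	m->folio_locked_by_io = false;
	return false;
}

/* Returns the lock under which the fault finally completed. */
static enum fault_lock model_page_fault(struct fault_model *m)
{
	if (m->vma_lock_available) {
		if (!model_handle_fault(m, LOCK_VMA))
			return LOCK_VMA;
		m->retries++;
	}

	/* Fallback path: the retry loop under mmap_lock is still needed. */
	while (model_handle_fault(m, LOCK_MMAP))
		m->retries++;
	return LOCK_MMAP;
}
```

In this model, with vma_lock_available = false and drop_lock_for_io =
false, the I/O wait is counted against mmap_lock even though no retry
happened; with drop_lock_for_io = true the wait happens unlocked at the
cost of one retry.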

Best Regards
Barry