Re: [PATCH v2] mm/rmap: fix incorrect pte restoration for lazyfree folios
From: Barry Song
Date: Mon Mar 02 2026 - 01:38:51 EST
On Mon, Mar 2, 2026 at 12:26 PM Dev Jain <dev.jain@xxxxxxx> wrote:
[...]
> >>
> >> diff --git a/mm/rmap.c b/mm/rmap.c
> >> index bff8f222004e4..fb64829913052 100644
> >> --- a/mm/rmap.c
> >> +++ b/mm/rmap.c
> >> @@ -1955,7 +1955,17 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
> >> if (userfaultfd_wp(vma))
> >> return 1;
> >>
> >> - return folio_pte_batch(folio, pvmw->pte, pte, max_nr);
> >> + if (!folio_test_anon(folio))
> >> + return folio_pte_batch(folio, pvmw->pte, pte, max_nr);
> >> +
> >> + /*
> >> + * For anon folios, if unmap fails, we need to restore the ptes.
> >> + * To avoid accidentally upgrading write permissions for ptes that
> >> + * were not originally writable, and to avoid losing the soft-dirty
> >> + * bit, use the appropriate FPB flags.
> >> + */
> >> + return folio_pte_batch_flags(folio, vma, pvmw->pte, &pte, max_nr,
> >> + FPB_RESPECT_WRITE | FPB_RESPECT_SOFT_DIRTY);
> >
> > Do we really need to differentiate between file and anon?
> > I’d rather just return unconditionally by removing the
> > if (!folio_test_anon(folio)) check above.
>
> File mappings don't need preservation of writable bit. I really think
> we should differentiate this. It improves understanding of code by
> telling the reader that after clearing ptes for file mappings and
> losing the writable bit, the correctness is ensured in the fault handler.
I understand this is the case for file-backed folios in
the unmap path. However, I’d prefer to keep things
simple. If file folios with inconsistent PTEs account
for only a very small proportion, it may not be worth
worrying about.
On the other hand, when we extend the current batching
code to try_to_migrate_one(), the situation could be
different from the unmap path.
Anyway, I don’t feel strongly about this. If you, David,
Lorenzo, and others think it’s better to differentiate
between file and anon, I’m fine with that.
> >
> > If we do want to keep two branches, why not use a flag variant instead?
> >
> > flag = 0;
> >
> > /* for anon folios .... */
> > if (folio_test_anon(folio))
> > flag = FPB_RESPECT_WRITE | FPB_RESPECT_SOFT_DIRTY;
> > return folio_pte_batch_flags(...., flag);
>
> Yep this is better.
Yep, that would be more readable, as it makes it clear
that the difference is the flag.
Thanks
Barry