Re: [PATCH] mm: fix finish_fault() handling for large folios
From: Matthew Wilcox
Date: Wed Feb 26 2025 - 11:28:47 EST

On Wed, Feb 26, 2025 at 04:42:46PM +0100, David Hildenbrand wrote:
> On 26.02.25 15:03, Matthew Wilcox wrote:
> > On Wed, Feb 26, 2025 at 06:48:15AM -0500, Brian Geffon wrote:
> > > When handling faults for anon shmem, finish_fault() will attempt to
> > > install ptes for the entire folio. Unfortunately, if it encounters a
> > > single non-pte_none entry in that range it will bail, even if the pte
> > > that triggered the fault is still pte_none. When this happens, the
> > > fault is retried endlessly without ever making forward progress.
> > >
> > > This patch fixes this behavior: if it detects that a pte in the range
> > > is not pte_none, it falls back to setting just the pte for the
> > > address that triggered the fault.
> >
> > Surely there's a similar problem in do_anonymous_page()?
>
> As far as I recall from the last time I stared at it, we handle it
> correctly in there.
>
> We check pte_none to decide which folio size we can allocate (including
> basing the decision on other factors like VMA etc), and after retaking the
> PTL, we recheck vmf_pte_changed / pte_range_none() to make sure there were
> no races.

Ah, so then we'll retry and allocate a folio of the right size the next
time? Rather than the shmem case where the folio is already allocated
and we can't change that?
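
To make the distinction concrete, here is a rough userspace model of the
two cases discussed in this thread. It is only a sketch, not kernel code:
toy_pte_t, toy_range_none(), toy_map_existing_folio() and
toy_pick_anon_nr() are hypothetical names invented for illustration, a
"page table" is just an array where 0 means pte_none, and locking is
ignored entirely. It assumes the candidate range is a power of two in
size and starts at index 0 of the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a pte: 0 means "none", anything else is populated. */
typedef unsigned long toy_pte_t;

/* Loosely modelled on the idea of pte_range_none(): are all nr entries empty? */
static bool toy_range_none(const toy_pte_t *pte, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (pte[i] != 0)
			return false;
	return true;
}

/*
 * Shmem-like case: the folio already exists and spans nr entries, so its
 * size cannot be changed. Map the whole range if it is empty; otherwise
 * fall back to the faulting entry alone. Returns the number of entries
 * installed (0 if the faulting entry was already populated).
 */
static size_t toy_map_existing_folio(toy_pte_t *pte, size_t nr,
				     size_t fault_idx, toy_pte_t val)
{
	assert(fault_idx < nr);

	if (toy_range_none(pte, nr)) {
		for (size_t i = 0; i < nr; i++)
			pte[i] = val + i;
		return nr;
	}
	if (pte[fault_idx] != 0)
		return 0;
	pte[fault_idx] = val + fault_idx;
	return 1;
}

/*
 * Anon-like case: nothing is allocated yet, so halve the candidate size
 * until the naturally aligned range around fault_idx is empty. Returns
 * the number of entries to allocate for, or 0 if the faulting entry is
 * already populated.
 */
static size_t toy_pick_anon_nr(const toy_pte_t *pte, size_t max_nr,
			       size_t fault_idx)
{
	assert(fault_idx < max_nr);

	for (size_t nr = max_nr; nr >= 1; nr /= 2) {
		size_t start = fault_idx - (fault_idx % nr);

		if (toy_range_none(pte + start, nr))
			return nr;
	}
	return 0;
}
```

In this model the anon path can always pick a smaller size on a retry,
while the shmem-like path has to cope with whatever folio size it was
handed, which is why it needs the single-entry fallback.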