Re: [PATCH v2] mm: limit filemap_fault readahead to VMA boundaries

From: David Hildenbrand (Arm)

Date: Mon Apr 27 2026 - 08:44:51 EST

On 4/27/26 05:01, Frederick Mayle wrote:
> When a file mapping covers a strict subset of a file, an access to the
> mapping can trigger readahead of file pages outside the mapped region.
> Readahead is meant to prefetch pages likely to be accessed soon, but
> these pages aren't accessible via the same means, so it fair to say we
> don't have a good indicator they'll be accessed soon. Take an ELF file
> for example: An access to the end of a program's read-only segment isn't
> a sign that nearby file contents will be accessed next (they are likely
> to be mapped discontiguously, or not at all). The pressure from loading
> these pages into the cache can evict more useful pages.
>
> To improve the behavior, make three changes:
>
> * Introduce a new readahead_control field, max_index, as a hard limit on
> the readahead. The existing file_ra_state->size can't be used as a
> limit, it is more of a hint and can be increased by various
> heuristics.
> * Set readahead_control->max_index to the end of the VMA in all of the
> readahead paths that can be triggered from a fault on a file mapping
> (both "sync" and "async" readahead).
> * Limit the read-around range start to the VMA's start.
>
> Note that these changes only affect readahead triggered in the context
> of a fault, they do not affect readahead triggered by read syscalls. If
> a user mixes the two types of accesses, the behavior is expected to be
> the following: if a fault causes readahead and places a PG_readahead
> marker and then a read(2) syscall hits the PG_readahead marker, the
> resulting async readahead *will not* be limited to the VMA end.
> Conversely, if a read(2) syscall places a PG_readahead marker and then a
> fault hits the marker, the async readahead *will* be limited to the VMA
> end.
>
> There is an edge case that the above motivation glosses over: A single
> file mapping might be backed by multiple VMAs. For example, a whole file
> could be mapped RW, then part of the mapping made RO using mprotect.
> This patch would hurt performance of a sequential faulted read of such a
> mapping, the degree depending on how fragmented the VMAs are. A usage
> pattern like that is likely rare and already suffering from sub-optimal
> performance because, e.g., the fragmented VMAs limit the fault-around,
> so each VMA boundary in a sequential faulted read would cause a minor
> fault. Still, this patch would make it worse. See a previous discussion
> of this topic at [1].

I agree that workloads that do a lot of mprotect() magic likely do not depend on
readahead optimizations.

But I'm sure we'll learn quickly if that is not the case :)

--
Cheers,

David