Re: [PATCH v2] mm: limit filemap_fault readahead to VMA boundaries

From: Kalesh Singh

Date: Mon Apr 27 2026 - 12:39:03 EST

On Mon, Apr 27, 2026 at 5:41 AM 'David Hildenbrand (Arm)' via
android-mm <android-mm@xxxxxxxxxx> wrote:
>
> On 4/27/26 05:01, Frederick Mayle wrote:
> > When a file mapping covers a strict subset of a file, an access to the
> > mapping can trigger readahead of file pages outside the mapped region.
> > Readahead is meant to prefetch pages likely to be accessed soon, but
> > these pages aren't accessible via the same means, so it is fair to say we
> > don't have a good indicator they'll be accessed soon. Take an ELF file
> > for example: An access to the end of a program's read-only segment isn't
> > a sign that nearby file contents will be accessed next (they are likely
> > to be mapped discontiguously, or not at all). The pressure from loading
> > these pages into the cache can evict more useful pages.
> >
> > To improve the behavior, make three changes:
> >
> > * Introduce a new readahead_control field, max_index, as a hard limit on
> > the readahead. The existing file_ra_state->size can't be used as a
> > limit; it is more of a hint and can be increased by various
> > heuristics.
> > * Set readahead_control->max_index to the end of the VMA in all of the
> > readahead paths that can be triggered from a fault on a file mapping
> > (both "sync" and "async" readahead).
> > * Limit the read-around range start to the VMA's start.
> >
> > Note that these changes only affect readahead triggered in the context
> > of a fault; they do not affect readahead triggered by read syscalls. If
> > a user mixes the two types of accesses, the behavior is expected to be
> > the following: if a fault causes readahead and places a PG_readahead
> > marker and then a read(2) syscall hits the PG_readahead marker, the
> > resulting async readahead *will not* be limited to the VMA end.
> > Conversely, if a read(2) syscall places a PG_readahead marker and then a
> > fault hits the marker, the async readahead *will* be limited to the VMA
> > end.
> >
> > There is an edge case that the above motivation glosses over: A single
> > file mapping might be backed by multiple VMAs. For example, a whole file
> > could be mapped RW, then part of the mapping made RO using mprotect.
> > This patch would hurt performance of a sequential faulted read of such a
> > mapping, the degree depending on how fragmented the VMAs are. A usage
> > pattern like that is likely rare and already suffering from sub-optimal
> > performance because, e.g., the fragmented VMAs limit the fault-around,
> > so each VMA boundary in a sequential faulted read would cause a minor
> > fault. Still, this patch would make it worse. See a previous discussion
> > of this topic at [1].
>
> I agree that workloads that do a lot of mprotect() magic likely do not depend on
> readahead optimizations.
>
> But I'm sure we'll learn quickly if that is not the case :)

Hi David,

Readahead is already limited to the VMA boundaries for exec VMAs (see
the link below), so perhaps these use cases are in fact rare enough,
but we'll need to see.

https://lore.kernel.org/all/20250609092729.274960-6-ryan.roberts@xxxxxxx/

Frederick, could we also now remove that logic (EXEC mappings)? Maybe
in a follow-up patch.
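
To make the limit under discussion concrete, here is a minimal userspace
sketch of clamping a proposed readahead window to a VMA's page range. It
is not the patch's code: struct ra_window and clamp_ra_to_vma() are
hypothetical names, all values are file page indices, and treating the
VMA end as an inclusive last index is an assumption made only for this
illustration.

```c
#include <assert.h>

/* Hypothetical userspace model; not kernel code. */
struct ra_window {
	unsigned long start;	/* first file page index to read */
	unsigned long nr;	/* number of pages to read */
};

/*
 * Clamp a proposed readahead window to the inclusive page range
 * [vma_first, vma_last]. The window is assumed to overlap the VMA,
 * which holds when the faulting page is inside both.
 */
static struct ra_window clamp_ra_to_vma(struct ra_window w,
					unsigned long vma_first,
					unsigned long vma_last)
{
	unsigned long last;

	assert(vma_first <= vma_last);
	assert(w.nr > 0);

	last = w.start + w.nr - 1;
	assert(w.start <= vma_last && last >= vma_first);

	/* Read-around must not start before the VMA. */
	if (w.start < vma_first)
		w.start = vma_first;
	/* Readahead must not run past the end of the VMA. */
	if (last > vma_last)
		last = vma_last;

	w.nr = last - w.start + 1;
	return w;
}
```

With this model, a 32-page window starting at index 90, clamped to a VMA
covering pages 100 through 115, becomes 16 pages starting at index 100.
A window that already lies inside the VMA is returned unchanged.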

For this patch: Reviewed-by: Kalesh Singh <kaleshsingh@xxxxxxxxxx>

Thanks,
Kalesh

>
> --
> Cheers,
>
> David