Re: [PATCH v2 2/4] mm: replace exec_folio_order() with generic preferred_exec_order()

From: Jan Kara

Date: Fri Mar 20 2026 - 10:47:43 EST


On Fri 20-03-26 06:58:52, Usama Arif wrote:
> Replace the arch-specific exec_folio_order() hook with a generic
> preferred_exec_order() that dynamically computes the readahead folio
> order for executable memory. It targets min(PMD_ORDER, 2M) as the
> maximum, which optimally gives the right answer for contpte (arm64),
> PMD mapping (x86, arm64 4K), and architectures with smaller PMDs
> (s390 1M). It adapts at runtime based on:
>
> - VMA size: caps the order so folios fit within the mapping
> - Memory pressure: steps down the order when the local node's free
>   memory is below the high watermark for the requested order
>
> This avoids over-allocating on memory-constrained systems while still
> requesting the optimal order when memory is plentiful.
>
> Since exec_folio_order() is no longer needed, remove the arm64
> definition and the generic default from pgtable.h.
>
> Signed-off-by: Usama Arif <usama.arif@xxxxxxxxx>
...
> +static unsigned int preferred_exec_order(struct vm_area_struct *vma)
> +{
> +	int order;
> +	unsigned long vma_len = vma_pages(vma);
> +	struct zone *zone;
> +	gfp_t gfp;
> +
> +	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
> +		return 0;
> +
> +	/* Cap at min(PMD_ORDER, 2M) */
> +	order = min(HPAGE_PMD_ORDER, ilog2(SZ_2M >> PAGE_SHIFT));
> +
> +	/* Don't request folios larger than the VMA */
> +	order = min(order, ilog2(vma_len));

Hum, as far as I can tell, page_cache_ra_order(), which is called from
do_sync_mmap_readahead(), treats ra->order as the preferred order but trims
it down to fit both within the file and within ra->size. And ra->size is
set so that the readahead fits within the vma, so I don't think any order
trimming based on vma length is needed here?
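
To illustrate the kind of trimming I mean, here is a rough userspace sketch.
It is not the kernel code: model_ra_order() and ilog2_ul() are made-up names,
and the model assumes only two limits (the readahead window and the end of
the file), ignoring index alignment and the mapping's minimum folio order.

```c
#include <assert.h>

/* floor(log2(n)) for n > 0; userspace stand-in for the kernel's ilog2() */
static unsigned int ilog2_ul(unsigned long n)
{
	unsigned int r = 0;

	assert(n > 0);
	while (n >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical model: largest order usable for a folio starting at page
 * 'index' when the readahead window covers ra_size pages from 'index' and
 * the file is file_pages pages long. All names here are illustrative.
 */
static unsigned int model_ra_order(unsigned int preferred, unsigned long index,
				   unsigned long ra_size,
				   unsigned long file_pages)
{
	unsigned long limit;	/* last page index the readahead may touch */
	unsigned int order = preferred;

	assert(ra_size > 0 && index < file_pages);

	limit = index + ra_size - 1;
	if (limit > file_pages - 1)
		limit = file_pages - 1;

	/* A folio can never be larger than the readahead window itself */
	if (order > ilog2_ul(ra_size))
		order = ilog2_ul(ra_size);

	/* Step down until the folio ends at or before the limit */
	while (order > 0 && index + (1UL << order) - 1 > limit)
		order--;

	return order;
}
```

In this model, model_ra_order(9, 0, 64, 1024) gives 6: a 64-page window
cannot hold anything bigger than an order-6 folio, whatever the preferred
order was, which is why an extra vma-length cap on the caller side looks
redundant to me.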

> +	/* Step down under memory pressure */
> +	gfp = mapping_gfp_mask(vma->vm_file->f_mapping);
> +	zone = first_zones_zonelist(node_zonelist(numa_node_id(), gfp),
> +				    gfp_zone(gfp), NULL)->zone;
> +	if (zone) {
> +		while (order > 0 &&
> +		       !zone_watermark_ok(zone, order,
> +					  high_wmark_pages(zone), 0, 0))
> +			order--;
> +	}

It looks wrong for this logic to be here. Trimming the order based on memory
pressure makes sense (and we've already got reports that on memory-limited
devices large-order folios in the page cache have too much memory overhead,
so we'll likely need to handle that for page cache allocations in general)
but IMHO it belongs in page_cache_ra_order() or some other common place
like that.
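
Just to sketch the shape of a shared helper, here is a userspace model. It
is an illustration under loud assumptions, not a proposal for the actual
code: struct model_zone, model_watermark_ok() and model_trim_order() are
all hypothetical, and the watermark check is reduced to "total free pages
are above the high watermark and some free block of at least the requested
order exists".

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_ORDERS 11	/* assumption: orders 0..10 */

/* Hypothetical, heavily simplified stand-in for struct zone */
struct model_zone {
	unsigned long nr_free[MODEL_NR_ORDERS];	/* free blocks per order */
	unsigned long high_wmark;		/* high watermark, in pages */
};

/* Total free pages across all orders */
static unsigned long model_free_pages(const struct model_zone *z)
{
	unsigned long pages = 0;

	for (unsigned int o = 0; o < MODEL_NR_ORDERS; o++)
		pages += z->nr_free[o] << o;
	return pages;
}

/* Simplified check: above the high watermark and a big enough block exists */
static bool model_watermark_ok(const struct model_zone *z, unsigned int order)
{
	if (model_free_pages(z) <= z->high_wmark)
		return false;
	for (unsigned int o = order; o < MODEL_NR_ORDERS; o++)
		if (z->nr_free[o])
			return true;
	return false;
}

/* Step the order down until the simplified watermark check passes */
static unsigned int model_trim_order(const struct model_zone *z,
				     unsigned int order)
{
	assert(order < MODEL_NR_ORDERS);
	while (order > 0 && !model_watermark_ok(z, order))
		order--;
	return order;
}
```

The point of having this in one common place is that every page cache
allocation path could call the same helper with its preferred order,
instead of each caller open-coding its own watermark loop.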

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR