Re: [PATCH] mm/swap_state: remove unnecessary lru_add_drain() from readahead
From: Shakeel Butt
Date: Mon Jun 08 2026 - 13:22:35 EST
On Mon, Jun 08, 2026 at 07:32:42AM -0700, Usama Arif wrote:
> swap_cluster_readahead() and swap_vma_readahead() end the readahead
> loop with an explicit lru_add_drain() call. That drain is a leftover
> from 2.6.12 era code and serves no functional purpose for the callers:
>
> - do_swap_page() ignores LRU residency for the readahead folios;
> it only needs the target folio it called swapin_readahead() for,
> and if the write-fault path needs the target folio on the LRU to count
> references accurately, it runs its own lru_add_drain() at the
> wp_can_reuse_anon_folio() and do_swap_page() sites.
>
> - shmem_swapin_cluster() immediately locks the returned folio, waits
> for writeback, then operates on it - LRU residency of either the target
> or the readahead folios is irrelevant.
>
> - try_to_unuse() likewise locks the folio and calls unuse_pte() without
> depending on LRU presence.
>
> Folios newly added to the swap cache by the readahead loop sit in
> the per-CPU LRU folio_batch and will be drained naturally as the
> batch fills (FOLIO_BATCH_SIZE), by the next reclaim/compaction
> lru_add_drain_all(), and so on. The unconditional drain only
> synchronously flushes a partial batch and forces contention on
> lruvec_lock.
>
> On a 176-CPU production host running a memory-pressured workload, this
> path was observed to call folio_batch_move_lru() from
> swap_cluster_readahead() ~28K/min, a very large source of LRU lock
> traffic.
>
> This is a direct continuation of the cleanup started in commit
> 1aa43598c03b ("mm: remove unnecessary calls to lru_add_drain") which
> removed the equivalent drain from free_pages_and_swap_cache() with
> the same rationale. A detailed reasoning for this is present in [1].
>
> Remove both drains.
>
> [1] https://lore.kernel.org/all/dca2824e8e88e826c6b260a831d79089b5b9c79d.camel@xxxxxxxxxxx/T/#u
>
> Signed-off-by: Usama Arif <usama.arif@xxxxxxxxx>
Acked-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>
Thanks for pushing this. JP was also looking into LRU lock contention sources.
In particular, we lack visibility into the lru_add_drain_all() callers. The idea
was to add tracepoints to track such callers. (Just nudging you towards it :P)