Re: [PATCH v2 5/5] mm/vmscan: flush TLB for every 31 folios evictions

From: Zhang Peng

Date: Sun Mar 29 2026 - 23:19:36 EST


On Thu, Mar 26, 2026 at 12:40:50PM +0000, Pedro Falcato wrote:
> On Thu, Mar 26, 2026 at 04:36:21PM +0800, Zhang Peng wrote:
> > +    folio_batch_reinit(fbatch);
> > +    for (i = 0; i < count; ++i) {
> > +        folio = fbatch->folios[i];
> > +        if (!folio_trylock(folio)) {
> > +            list_add(&folio->lru, ret_folios);
> > +            continue;
> > +        }
> > +
> > +        if (folio_test_writeback(folio) || folio_test_lru(folio) ||
>
> If PG_lru is set here, we're in a world of trouble as we're actively using
> folio->lru. I don't think it's possible for it to be set, as isolating folios
> clears lru, and refcount bump means the folio cannot be reused or reinserted
> back on the LRU. So perhaps:
>         VM_WARN_ON_FOLIO(folio_test_lru(folio), folio);

Agreed. The folio was isolated from LRU in shrink_folio_list()'s caller
with PG_lru cleared, and we hold a reference throughout. There's no path
that can re-set PG_lru while we hold the folio. Will replace the
folio_test_lru() check with VM_WARN_ON_FOLIO.
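Concretely, something like this (untested sketch against this hunk,
to be respun for v3):

    VM_WARN_ON_FOLIO(folio_test_lru(folio), folio);
    if (folio_test_writeback(folio) || folio_mapped(folio) ||
        folio_maybe_dma_pinned(folio)) {
            folio_unlock(folio);
            list_add(&folio->lru, ret_folios);
            continue;
    }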

> > +            folio_mapped(folio) || folio_maybe_dma_pinned(folio)) {
> > +            folio_unlock(folio);
> > +            list_add(&folio->lru, ret_folios);
> > +            continue;
> > +        }
> > +
> > +        folio_batch_add(fbatch, folio);
> > +    }
> > +
> > +    i = 0;
> > +    count = folio_batch_count(fbatch);
> > +    if (!count)
> > +        return;
> > +    /* One TLB flush for the batch */
> > +    try_to_unmap_flush_dirty();
> > +    for (i = 0; i < count; ++i) {
> > +        folio = fbatch->folios[i];
> > +        pageout_one(folio, ret_folios, free_folios, sc, stat, plug,
> > +                folio_list);
>
> Would be lovely if we could pass the batch down to the swap layer.

Agreed, that would be the logical next step. For now I kept the scope
small to just batch the TLB flush, but submitting swap IO in batches
could further reduce per-folio overhead. Will look into that as a
follow-up.

> > +    }
> > +    folio_batch_reinit(fbatch);
>
> The way you keep reinitializing fbatch is a bit confusing.
> Probably worth a comment or two (or kdocs for pageout_batch documenting
> that the folio batch is reset, etc).

Will add kdocs. To spell it out: folio_batch_reinit() only resets the
count, so after saving the original count the first reinit lets the
loop refill the same array in place with only the folios that are
still valid for pageout, while the rest go to ret_folios. The second
reinit empties the batch once pageout_one() has consumed it, so the
caller always gets back an empty batch. Will make this two-phase
usage explicit in the documentation.
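Roughly the kdoc I have in mind (wording to be polished when I send
v3):

    /**
     * pageout_batch - write back a batch of folios with one TLB flush
     * @fbatch: candidate folios; reinitialized (empty) on return
     *
     * @fbatch is used in two phases: folio_batch_reinit() only resets
     * the count, so the first reinit lets the same array be refilled
     * in place with the folios that are still lockable, not under
     * writeback, unmapped and not pinned; the rest are moved to
     * @ret_folios. After a single try_to_unmap_flush_dirty(), each
     * survivor is handed to pageout_one(), and the second reinit
     * leaves the caller with an empty batch.
     */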

> > +            folio_unlock(folio);
>
> Why is the folio unlocked? I don't see the need to take the lock trip twice.
> Is there something I'm missing?

We should not hold the folio lock longer than necessary. The folio
sits in flush_folios while the main loop continues scanning the
remaining folios; accumulating a full batch can mean processing up to
31 more folios through trylock, unmap, swap allocation and so on
before the batch is flushed.

During that window the folio is still in swap cache and findable by
other CPUs. For example, do_swap_page() can look it up via
swap_cache_get_folio() and then block at folio_lock_or_retry() waiting
for us to finish accumulating. That is a direct stall on a process
trying to fault in its own memory.

Unlocking here and relocking in pageout_batch() is the same pattern used
by the demote path a few lines above.
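I.e. the caller-side flow is roughly (simplified, untested; relying on
folio_batch_add() returning the space left in the batch):

    folio_unlock(folio);
    /* Parked unlocked, so faulters are not blocked while the batch fills. */
    if (!folio_batch_add(&flush_folios, folio))
            pageout_batch(&flush_folios, ...);

and pageout_batch() then retakes each lock via folio_trylock(), as in
the hunk above.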