Re: [PATCH v2 08/12] mm/mglru: simplify and improve dirty writeback handling
From: Kairui Song
Date: Thu Apr 02 2026 - 07:45:26 EST
On Wed, Apr 01, 2026 at 04:37:14PM +0800, Shakeel Butt wrote:
> On Sun, Mar 29, 2026 at 03:52:34AM +0800, Kairui Song via B4 Relay wrote:
> > From: Kairui Song <kasong@xxxxxxxxxxx>
> >
> > The current handling of dirty writeback folios is not working well for
> > file-page-heavy workloads: dirty folios are protected and moved to the
> > next gen upon isolation, instead of being throttled or reactivated at
> > pageout time (shrink_folio_list).
> >
> > This might help reduce LRU lock contention slightly, but as a result,
> > the ping-pong effect of folios between the head and tail of the last two
> > gens is serious, as the shrinker runs into protected dirty writeback
> > folios far more frequently than it would if they were reactivated. The
> > dirty flush wakeup condition is also much more passive than on the
> > active/inactive LRU: the active/inactive LRU wakes the flusher if one
> > batch of folios passed to shrink_folio_list is entirely unevictable due
> > to being under writeback, but MGLRU instead has to check this after the
> > whole reclaim loop is done, and then compare the number of
> > isolation-time protections against the total reclaim count.
>
> I was just ranting about this on Baolin's patch and thanks for unifying them.
>
> >
> > And we previously saw OOM problems with it, too, which were fixed but
> > still not perfect [1].
> >
> > So instead, just drop the special handling for dirty writeback folios
> > and reactivate them like the active/inactive LRU does. Also move the
> > dirty flush wakeup check to right after shrink_folio_list. This should
> > improve both throttling and performance.
>
> Please divide this patch into two separate ones: one for moving the flusher
> wakeup (& v1 throttling) within evict_folios(), and a second for the above
> dirty writeback heuristic.
OK, but throttling is not handled by this commit; it is handled by the last
commit. And using the common routine in shrink_folio_list to activate the
folio is supposed to be done before moving the flusher wakeup and throttle:
I observed reclaim becoming inefficient, or over-aggressive / too passive,
if we don't do that first. We keep running into these folios again and
again very frequently, and shrink_folio_list also has better dirty /
writeback detection.
I tested these two changes separately again, in case I remembered it
wrongly, using the MongoDB YCSB case:

Before this series (or just this commit), results are similar:
  Throughput(ops/sec), 63414.891930455
Applying only the part of this commit that removes folio_inc_gen and uses
shrink_folio_list to activate the folio:
  Throughput(ops/sec), 68580.83394294075
Skipping the folio_inc_gen part but applying the rest:
  Throughput(ops/sec), 61614.29451632779
With the two fixes together (this commit applied fully):
  Throughput(ops/sec), 80857.08510208207
And with the whole series:
  Throughput(ops/sec), 79760.71784646061
The test is a bit noisy, but after the whole series the throttling already
seems to be slightly slowing down the workload. That is still acceptable
IMO, and it is also why activating the folios here is a good idea;
otherwise we run into problematic throttling.
I think this can be further improved later. As I observed previously with
the LFU-like rework I mentioned, it helps promote folios more proactively
to a younger gen and yields even better performance:
https://lore.kernel.org/linux-mm/CAMgjq7BoekNjg-Ra3C8M7=8=75su38w=HD782T5E_cxyeCeH_g@xxxxxxxxxxxxxx/
For now I can split this into two commits in V3: first a commit to use the
common routine for activating the folio, then one to move the flusher
wakeup.