[RFC PATCH] mm: introduce reclaim throttle in MGLRU

From: zhaoyang.huang
Date: Mon Jul 15 2024 - 06:24:45 EST


From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>

As with legacy LRU management, direct reclaim threads under MGLRU can
isolate too many folios and then be rescheduled, which can lead to
unwanted generation updates as well as the same thrashing seen before.
Throttle direct reclaim in MGLRU by comparing the number of isolated
folios against the number of inactive folios.

Tested by launching 8 instances of costmem (an Android executable)
concurrently; the system no longer hangs as it did before.

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
---
mm/vmscan.c | 11 +++++++++++
1 file changed, 11 insertions(+)
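
For readers who want the throttle decision in isolation, here is a
minimal userspace sketch of it. It is not kernel code. The struct, the
model_* functions and the throttle_* callbacks are hypothetical names
made up for illustration. The kswapd exemption and the ">> 3" scaling
for callers that may do IO/FS work reflect my reading of the in-tree
too_many_isolated() helper and should be treated as assumptions; the
writeback-throttling check is left out. The sketch sets the stalled
flag before waiting, as the classic LRU path in shrink_inactive_list()
does, whereas the hunk as posted never sets it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-node counters the check reads. */
struct node_counters {
	unsigned long nr_inactive;	/* models NR_INACTIVE_ANON or _FILE */
	unsigned long nr_isolated;	/* models NR_ISOLATED_ANON or _FILE */
};

/*
 * Model of the decision: kswapd is never throttled; a direct reclaimer
 * that may do IO/FS work is compared against one eighth of the inactive
 * count (assumption, see the note above), others against the full count.
 */
static bool model_too_many_isolated(const struct node_counters *c,
				    bool is_kswapd, bool may_io_fs)
{
	unsigned long inactive = c->nr_inactive;

	if (is_kswapd)
		return false;
	if (may_io_fs)
		inactive >>= 3;
	return c->nr_isolated > inactive;
}

/* Hypothetical callback standing in for reclaim_throttle(). */
typedef void (*throttle_fn)(struct node_counters *c);

/* Example callback: other reclaimers finished and put their folios back. */
static void throttle_drains(struct node_counters *c)
{
	c->nr_isolated = 0;
}

/* Example callback: the wait ended but nothing changed. */
static void throttle_no_progress(struct node_counters *c)
{
	(void)c;
}

/*
 * Wait at most once. Returns true if the caller may go on to isolate
 * folios, false if it should give up for this pass.
 */
static bool model_wait_for_isolation(struct node_counters *c, bool is_kswapd,
				     bool may_io_fs, throttle_fn throttle)
{
	bool stalled = false;

	while (model_too_many_isolated(c, is_kswapd, may_io_fs)) {
		if (stalled)
			return false;
		stalled = true;	/* bound the loop to a single wait */
		throttle(c);
	}
	return true;
}
```

With nr_inactive = 800 and nr_isolated = 101, a direct reclaimer that
may do IO/FS work is throttled (101 > 800 >> 3 = 100), while kswapd or
a caller without IO/FS permission is not. If the single wait makes no
progress, the caller backs off instead of spinning.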

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2e34de9cd0d4..a7fdad1b2a78 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4481,6 +4481,7 @@ static int isolate_folios(struct lruvec *lruvec, struct scan_control *sc, int sw
 	int scanned;
 	int tier = -1;
 	DEFINE_MIN_SEQ(lruvec);
+	bool stalled = false;

 	/*
 	 * Try to make the obvious choice first, and if anon and file are both
@@ -4503,6 +4504,14 @@ static int isolate_folios(struct lruvec *lruvec, struct scan_control *sc, int sw
 	else
 		type = get_type_to_scan(lruvec, swappiness, &tier);

+	spin_unlock_irq(&lruvec->lru_lock);
+	while (unlikely(too_many_isolated(lruvec_pgdat(lruvec), type, sc))) {
+		if (stalled)
+			return 0;
+		reclaim_throttle(lruvec_pgdat(lruvec), VMSCAN_THROTTLE_ISOLATED);
+	}
+	spin_lock_irq(&lruvec->lru_lock);
+
 	for (i = !swappiness; i < ANON_AND_FILE; i++) {
 		if (tier < 0)
 			tier = get_tier_idx(lruvec, type);
@@ -4550,8 +4559,10 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap
 	if (list_empty(&list))
 		return scanned;
 retry:
+	__mod_node_page_state(lruvec_pgdat(lruvec), NR_ISOLATED_ANON + type, scanned);
 	reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);
 	sc->nr_reclaimed += reclaimed;
+	__mod_node_page_state(lruvec_pgdat(lruvec), NR_ISOLATED_ANON + type, -scanned);
 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
 			scanned, reclaimed, &stat, sc->priority,
 			type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
--
2.25.1