[PATCH mm-unstable v1 1/2] mm/mglru: fix div-by-zero in vmpressure_calc_level()

From: Yu Zhao
Date: Thu Jul 11 2024 - 15:20:08 EST


evict_folios() uses a second pass to reclaim folios that have gone
through page writeback and become clean before it finishes the first
pass, since folio_rotate_reclaimable() cannot handle those folios due
to the isolation.

The second pass tries to avoid potential double counting by deducting
scan_control->nr_scanned. However, this can result in underflow of
nr_scanned, under a condition where shrink_folio_list() does not
increment nr_scanned, i.e., when folio_trylock() fails.

The underflow can cause the divisor, i.e., scale=scanned+reclaimed in
vmpressure_calc_level(), to become zero, resulting in the following
crash:

[exception RIP: vmpressure_work_fn+101]
process_one_work at ffffffffa3313f2b

Since scan_control->nr_scanned has no established semantics, the
potential double counting has minimal risks. Therefore, fix the
problem by not deducting scan_control->nr_scanned in evict_folios().

Reported-by: Wei Xu <weixugc@xxxxxxxxxx>
Fixes: 359a5e1416ca ("mm: multi-gen LRU: retry folios written back while isolated")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
---
mm/vmscan.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0761f91b407f..6403038c776e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4597,7 +4597,6 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap

/* retry folios that may have missed folio_rotate_reclaimable() */
list_move(&folio->lru, &clean);
- sc->nr_scanned -= folio_nr_pages(folio);
}

spin_lock_irq(&lruvec->lru_lock);
--
2.45.2.993.g49e7a77208-goog