Re: mm: 5.16 regression: reclaim_throttle leads to stall in near-OOM conditions

From: Mel Gorman
Date: Fri Nov 26 2021 - 11:26:24 EST


On Sat, Nov 27, 2021 at 01:06:31AM +0900, Alexey Avramov wrote:
> >Please let me know if this version works any better
>
> It's better, but not the same as 5.15.
>
> Sometimes stall is short, sometimes is long (3 `tail /dev/zero` test):
>

It's somewhat expected. If the system is able to make some sort of
progress and kswapd is active, it'll throttle until progress is
impossible. How long it can keep making progress, be it by discarding
page cache or writing to swap, will be somewhat variable, but it'll
only OOM when the system is truly OOM.

Might be worth trying the patch below on top. It will delay throttling
for longer with the caveat that CPU usage due to reclaim when very low
on memory may be excessive.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 176ddd28df21..167ea4f324a8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3404,8 +3404,8 @@ static void consider_reclaim_throttle(pg_data_t *pgdat, struct scan_control *sc)
 	if (current_is_kswapd())
 		return;

-	/* Throttle if making no progress at high prioities. */
-	if (sc->priority < DEF_PRIORITY - 2 && !sc->nr_reclaimed)
+	/* Throttle if making no progress at high priority. */
+	if (sc->priority == 1 && !sc->nr_reclaimed)
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_NOPROGRESS);
 }
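
To make the effect of the changed condition concrete, here is a rough
userspace sketch (not kernel code) that counts the priority levels at
which a reclaim pass that frees nothing would be throttled, before and
after the patch. It assumes DEF_PRIORITY is 12 and that direct reclaim
walks priorities from DEF_PRIORITY down to 0. The helper names
throttle_before(), throttle_after() and count_throttled() are made up
for illustration and do not exist in mm/vmscan.c.

```c
#include <stdbool.h>

/* Assumption: mirrors the kernel's DEF_PRIORITY value. */
#define DEF_PRIORITY 12

/* Condition removed by the patch: any priority below DEF_PRIORITY - 2. */
static bool throttle_before(int priority, unsigned long nr_reclaimed)
{
	return priority < DEF_PRIORITY - 2 && !nr_reclaimed;
}

/* Condition added by the patch: only priority 1. */
static bool throttle_after(int priority, unsigned long nr_reclaimed)
{
	return priority == 1 && !nr_reclaimed;
}

/*
 * Hypothetical helper: count the priority levels at which a pass
 * that reclaimed nothing (nr_reclaimed == 0) would throttle.
 * Assumes priorities run from DEF_PRIORITY down to 0.
 */
static int count_throttled(bool (*cond)(int, unsigned long))
{
	int n = 0;

	for (int prio = DEF_PRIORITY; prio >= 0; prio--)
		if (cond(prio, 0))
			n++;
	return n;
}
```

Under those assumptions, count_throttled(throttle_before) covers
priorities 9 through 0, while count_throttled(throttle_after) covers
only priority 1, which is why throttling is delayed for longer and
more CPU may be spent in reclaim before it kicks in.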