Re: [PATCH] mm: vmscan: account for free pages to prevent infinite loop in throttle_direct_reclaim()
From: Andrew Morton
Date: Sat Nov 30 2024 - 21:40:24 EST
On Sun, 1 Dec 2024 01:12:34 +0900 Seiji Nishikawa <snishika@xxxxxxxxxx> wrote:
> The kernel hangs due to a task stuck in throttle_direct_reclaim(),
> caused by a node being incorrectly deemed balanced despite pressure in
> certain zones, such as ZONE_NORMAL. This issue arises from
> zone_reclaimable_pages() returning 0 for zones without reclaimable file-
> backed or anonymous pages, causing zones like ZONE_DMA32 with sufficient
> free pages to be skipped.
>
> The lack of swap or reclaimable pages results in ZONE_DMA32 being
> ignored during reclaim, masking pressure in other zones. Consequently,
> pgdat->kswapd_failures remains 0 in balance_pgdat(), preventing fallback
> mechanisms in allow_direct_reclaim() from being triggered, leading to an
> infinite loop in throttle_direct_reclaim().
>
> This patch modifies zone_reclaimable_pages() to account for free pages
> (NR_FREE_PAGES) when no other reclaimable pages exist. This ensures
> zones with sufficient free pages are not skipped, enabling proper
> balancing and reclaim behavior.
We'll want to backport a fix for this into -stable kernels. For that
it's best to be able to identify a suitable Fixes: target, to tell
others whether their kernel needs the fix. Are you able to help
identify that commit?
Thanks.
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -374,7 +374,14 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
>  	if (can_reclaim_anon_pages(NULL, zone_to_nid(zone), NULL))
>  		nr += zone_page_state_snapshot(zone, NR_ZONE_INACTIVE_ANON) +
>  			zone_page_state_snapshot(zone, NR_ZONE_ACTIVE_ANON);
> -
> +	/*
> +	 * If there are no reclaimable file-backed or anonymous pages,
> +	 * ensure zones with sufficient free pages are not skipped.
> +	 * This prevents zones like DMA32 from being ignored in reclaim
> +	 * scenarios where they can still help alleviate memory pressure.
> +	 */
> +	if (nr == 0)
> +		nr = zone_page_state_snapshot(zone, NR_FREE_PAGES);
>  	return nr;
>  }
>
> --
> 2.47.0
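
The failure mode described in the changelog can be illustrated with a rough
userspace sketch. This is not the kernel code: struct zone_model, its fields,
the model_* helpers, and all page counts are hypothetical, chosen only to mimic
the shape of the zone loop in allow_direct_reclaim(), which skips any zone
whose zone_reclaimable_pages() is 0 and then compares the summed free pages
against half of the summed min watermarks. The kswapd_failures shortcut is
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of one zone; all names and fields are illustrative. */
struct zone_model {
	unsigned long file_pages;  /* inactive + active file LRU pages */
	unsigned long anon_pages;  /* inactive + active anon LRU pages */
	unsigned long free_pages;  /* stands in for NR_FREE_PAGES */
	unsigned long min_wmark;   /* stands in for min_wmark_pages(zone) */
	bool can_reclaim_anon;     /* false when no swap is available */
};

/* Models zone_reclaimable_pages(), optionally with the proposed fallback. */
static unsigned long model_reclaimable_pages(const struct zone_model *z,
					     bool patched)
{
	unsigned long nr = z->file_pages;

	if (z->can_reclaim_anon)
		nr += z->anon_pages;
	/* Proposed change: fall back to free pages when nothing else counts. */
	if (patched && nr == 0)
		nr = z->free_pages;
	return nr;
}

/* Models the zone loop and watermark test in allow_direct_reclaim(). */
static bool model_allow_direct_reclaim(const struct zone_model *zones,
				       size_t n, bool patched)
{
	unsigned long reserve = 0, free_pages = 0;

	for (size_t i = 0; i < n; i++) {
		/* Zones reporting no reclaimable pages are skipped entirely. */
		if (!model_reclaimable_pages(&zones[i], patched))
			continue;
		reserve += zones[i].min_wmark;
		free_pages += zones[i].free_pages;
	}
	if (!reserve)
		return true;	/* no reserves counted: do not throttle */
	return free_pages > reserve / 2;
}

/* Hypothetical swapless node: DMA32 is mostly free, NORMAL is under pressure. */
static void model_example(void)
{
	const struct zone_model zones[] = {
		{ .file_pages = 0,  .anon_pages = 0,   .free_pages = 100000,
		  .min_wmark = 1000, .can_reclaim_anon = false },	/* DMA32 */
		{ .file_pages = 10, .anon_pages = 500, .free_pages = 50,
		  .min_wmark = 5000, .can_reclaim_anon = false },	/* NORMAL */
	};

	/* Unpatched: DMA32 is skipped, so 50 > 5000 / 2 fails and the task throttles. */
	assert(!model_allow_direct_reclaim(zones, 2, false));
	/* Patched: DMA32 is counted, so 100050 > 6000 / 2 holds. */
	assert(model_allow_direct_reclaim(zones, 2, true));
}
```

With these made-up numbers, the unpatched model keeps returning false for as
long as the zone counters stay unchanged, which corresponds to the task waiting
in throttle_direct_reclaim(). Counting DMA32's free pages flips the result.
Whether that is the right fix for the real kernel is a separate question that
the sketch does not answer.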