Re: [PATCH 1/2] mm/mglru: only clear kswapd_failures if reclaimable

From: Wei Xu
Date: Mon Oct 14 2024 - 19:41:23 EST


On Mon, Oct 14, 2024 at 4:25 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, 14 Oct 2024 22:12:11 +0000 Wei Xu <weixugc@xxxxxxxxxx> wrote:
>
> > lru_gen_shrink_node() unconditionally clears kswapd_failures, which
> > can prevent kswapd from sleeping and cause 100% kswapd cpu usage even
> > when kswapd repeatedly fails to make progress in reclaim.
> >
> > Only clear kswapd_failures in lru_gen_shrink_node() if reclaim makes
> > some progress, similar to shrink_node().
>
> That sounds bad. What triggers this? Can you suggest why it has just
> been discovered, after 1.5 years? And should the fix be backported into
> -stable kernels?
>

I happened to run into this problem in one of my tests recently. It
requires a combination of several conditions: the allocator needs to
allocate just the right amount of pages so that it wakes up kswapd
without itself being OOM-killed; there is no memory for kswapd to
reclaim (my test disables swap and cleans the page cache first); and
no other process frees enough memory at the same time.

I think the fix is a good candidate for stable kernels.
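
To illustrate the failure mode, here is a minimal user-space sketch of
the counter logic. It is a model under stated assumptions, not the
kernel code or the patch itself: struct model_pgdat, model_shrink_node()
and model_kswapd_gives_up() are hypothetical names, and the loop only
approximates how kswapd counts a pass without progress as a failure and
stops retrying once the count reaches MAX_RECLAIM_RETRIES.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as MAX_RECLAIM_RETRIES in the kernel's mm/internal.h. */
#define MAX_RECLAIM_RETRIES 16

/* Hypothetical stand-in for the pgdat field discussed in this thread. */
struct model_pgdat {
	int kswapd_failures;
};

/*
 * Models only the tail of lru_gen_shrink_node().
 * Unpatched: the failure counter is cleared unconditionally.
 * Patched (assumed shape of the fix): cleared only if this pass
 * reclaimed at least one page.
 */
static void model_shrink_node(struct model_pgdat *pgdat,
			      unsigned long nr_reclaimed, bool patched)
{
	if (!patched || nr_reclaimed > 0)
		pgdat->kswapd_failures = 0;
}

/*
 * Models kswapd's retry loop: each round runs one reclaim pass and
 * counts a failure if the pass reclaimed nothing. Returns true if the
 * failure count reaches MAX_RECLAIM_RETRIES within max_rounds, which
 * is the point where kswapd would stop retrying and go to sleep.
 */
static bool model_kswapd_gives_up(bool patched,
				  unsigned long reclaimed_per_pass,
				  int max_rounds)
{
	struct model_pgdat pgdat = { .kswapd_failures = 0 };

	for (int i = 0; i < max_rounds; i++) {
		if (pgdat.kswapd_failures >= MAX_RECLAIM_RETRIES)
			return true;

		model_shrink_node(&pgdat, reclaimed_per_pass, patched);

		/* No progress in this pass counts as one failure. */
		if (reclaimed_per_pass == 0)
			pgdat.kswapd_failures++;
	}

	return pgdat.kswapd_failures >= MAX_RECLAIM_RETRIES;
}
```

In this model, with nothing to reclaim, the unpatched variant never lets
the counter climb past 1, so the loop keeps spinning for as many rounds
as it is given; the patched variant reaches the limit after 16 passes
without progress and gives up.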