Re: [PATCH v1] mm/vmscan: mitigate spurious kswapd_failures reset from direct reclaim

From: Jiayuan Chen

Date: Tue Jan 06 2026 - 06:19:33 EST

January 6, 2026 at 17:49, "Michal Hocko" <mhocko@xxxxxxxx> wrote:

>
> On Tue 06-01-26 05:25:42, Jiayuan Chen wrote:
>
> >
> > That said, I believe this patch is still a valid fix on its own - resetting kswapd_failures
> > when the node is not actually balanced doesn't seem like correct behavior regardless of the
> > broader context.
> >
> Originally I was more inclined to opt out memcg reclaim from resetting
> the kswapd retry counter, but the more I am thinking about that the more
> your patch makes sense to me.
>
> The reason being that it handles both memcg and global direct reclaims
> in the same way, which makes the logic easier to follow. After all, the
> primary purpose is to resurrect kswapd after we can see there is a
> better chance to reclaim something for kswapd. Until that moment direct
> reclaim is the only reclaim mechanism.
>
> Relying on pgdat_balanced might lead to re-enabling kswapd much
> later while memory reclaim would still be mostly direct reclaim bound,
> thus increasing allocation latencies.
> If we wanted to do better we would need to evaluate recent
> refaults/thrashing behavior but even then I am not sure we can make a
> good cut off.
>
> So in the end the pgdat_balanced approach seems worth trying, to see
> whether it causes any corner cases.

Thanks Michal.

Regarding the allocation latency concern - we are already
in the direct reclaim slowpath, so a little extra overhead
from the pgdat_balanced check should be negligible.
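
To make the intended behavior concrete, here is a simplified userspace
sketch of the idea. It is not the actual patch: the struct, the helper
names and the single-watermark balance test are all hypothetical
stand-ins for the per-node state and for pgdat_balanced(), and the retry
limit is only an illustrative constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative limit after which kswapd is treated as hopeless for a node. */
#define TOY_MAX_RECLAIM_RETRIES 16

/* Toy stand-in for per-node reclaim state (hypothetical, not pglist_data). */
struct toy_node {
	int kswapd_failures;       /* consecutive unsuccessful kswapd runs */
	unsigned long free_pages;  /* pages currently free on the node */
	unsigned long high_wmark;  /* level at which the node counts as balanced */
};

/* Simplified balance test: a single node-wide high watermark. */
static bool toy_node_balanced(const struct toy_node *node)
{
	return node->free_pages >= node->high_wmark;
}

/*
 * Called after a direct reclaim pass (global or memcg alike).
 * The failure counter is cleared only if reclaim made progress AND the
 * node is balanced again, so partial progress alone no longer resets it.
 */
static void toy_direct_reclaim_done(struct toy_node *node,
				    unsigned long reclaimed)
{
	assert(node != NULL);

	node->free_pages += reclaimed;
	if (reclaimed > 0 && toy_node_balanced(node))
		node->kswapd_failures = 0;
}

/* kswapd stays parked once the failure counter has reached the limit. */
static bool toy_kswapd_enabled(const struct toy_node *node)
{
	return node->kswapd_failures < TOY_MAX_RECLAIM_RETRIES;
}
```

In this model, a direct reclaim pass that frees a few pages while the
node is still below the watermark leaves kswapd parked; only a pass that
brings the node back to balance re-enables it. The extra cost per pass
is the one balance check.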

> --
> Michal Hocko
> SUSE Labs
>