Re: [PATCH] mm/page_alloc: add zone to zonelist if populated

From: Wei Yang
Date: Tue Mar 15 2022 - 20:40:15 EST


On Thu, Feb 03, 2022 at 10:27:11AM +0100, Michal Hocko wrote:
>On Thu 03-02-22 02:00:22, Wei Yang wrote:
>> During memory hotplug, when a zone is onlined or offlined, we need to
>> rebuild the zonelist for all nodes. The current behavior can lose a valid
>> zone from the zonelist, since only managed zones are picked up.
>>
>> There are two cases for a zone with memory but still !managed.
>>
>> * all pages were allocated via memblock
>> * all pages were taken by ballooning / virtio-mem
>>
>> This state may be temporary, since both of them may release some memory.
>> We then end up with a managed zone that is not in the zonelist.
>>
>> This is introduced in 'commit 6aa303defb74 ("mm, vmscan: only allocate
>> and reclaim from zones with pages managed by the buddy allocator")'.
>> This patch restores the behavior.
>
>It has been introduced to fix a problem described in the changelog
>(FADUMP configuration making kswapd hogging a cpu). You are not
>explaining why the original issue is not possible after this change.
>

After some reading, here is what I found.

To keep this problem from coming back, we need to make sure reclaim is only
applied to managed zones. After going through the code, I found only two
places that do not guarantee this when iterating over zones:

1. skip_throttle_noprogress()
2. throttle_direct_reclaim()

Once vmscan only reclaims from managed zones, the original problem (kswapd
hogging a CPU) cannot be triggered by this change.
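
To illustrate the idea, here is a minimal userspace sketch, not kernel code.
struct mock_zone, mock_populated_zone(), mock_managed_zone() and the two
counting helpers are hypothetical stand-ins that I made up for this mail; they
only assume that "populated" means present pages != 0 and "managed" means
managed pages != 0. The zonelist keeps every populated zone, and the reclaim
side skips the ones that are not managed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone; only the two counters matter here. */
struct mock_zone {
	unsigned long present_pages;	/* pages physically present in the zone */
	unsigned long managed_pages;	/* pages handed to the buddy allocator */
};

/* Assumption: a zone is "populated" when it has any present pages. */
static bool mock_populated_zone(const struct mock_zone *z)
{
	return z->present_pages != 0;
}

/* Assumption: a zone is "managed" when the buddy allocator owns pages in it. */
static bool mock_managed_zone(const struct mock_zone *z)
{
	return z->managed_pages != 0;
}

/* Zonelist side: count every populated zone, managed or not. */
static size_t count_zonelist_entries(const struct mock_zone *zones, size_t n)
{
	size_t i, count = 0;

	assert(zones != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (mock_populated_zone(&zones[i]))
			count++;
	return count;
}

/* Reclaim side: walk the same zones, but skip the ones that are !managed. */
static size_t count_reclaim_candidates(const struct mock_zone *zones, size_t n)
{
	size_t i, count = 0;

	assert(zones != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!mock_populated_zone(&zones[i]))
			continue;	/* never entered the zonelist */
		if (!mock_managed_zone(&zones[i]))
			continue;	/* in the zonelist, but nothing to reclaim */
		count++;
	}
	return count;
}
```

With one empty zone, one populated but !managed zone and one managed zone, the
zonelist side counts two zones while the reclaim side only considers one.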

BTW, there are two other places that use for_each_zone_zonelist_nodemask().
It is OK for them not to check managed_zone(), since they are actually doing
a node-based iteration.

If this looks good to you, I will adjust the changelog and send two patches
to fix the two places above.

>I also think that this is more of theoretical issue than anything that
>is a real life concern. It is good to state that in the changelog as
>well.
>
>That being said I am not against the change but the changelog needs more
>explanation before I can ack it.
>
>> Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
>> CC: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> CC: David Hildenbrand <david@xxxxxxxxxx>
>> Fixes: 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator")
>
>Fixes tag should be really used only if the referenced commit breaks
>something. I do not really see this to be the case here.
>
>Thanks!
>

--
Wei Yang
Help you, Help me