Re: [PATCH] xen/balloon: fix page onlining when populating new zone

From: Wei Yang
Date: Fri Apr 08 2022 - 19:16:45 EST


On Thu, Apr 07, 2022 at 11:00:33AM +0200, David Hildenbrand wrote:
>On 07.04.22 10:50, Juergen Gross wrote:
>> On 07.04.22 10:23, David Hildenbrand wrote:
>>> On 06.04.22 15:32, Juergen Gross wrote:
>>>> When onlining a new memory page in a guest the Xen balloon driver is
>>>> adding it to the ballooned pages instead of making it available to be
>>>> used immediately. This is meant to allow adding a new upper memory
>>>> limit to a guest via memory hotplug, without having to assign the
>>>> new memory in one go.
>>>>
>>>> If the upper memory limit is raised above 4G, the new memory
>>>> will populate the ZONE_NORMAL memory zone, which wasn't populated
>>>> before. The newly populated zone won't be added to the list of zones
>>>> looked at by the page allocator though, as only zones with available
>>>> memory are being added, and the memory isn't yet available as it is
>>>> ballooned out.
>>>
>>> I think we just recently discussed these corner cases on the -mm list.
>>
>> Indeed.
>>
>>> The issue is having effectively populated zones without managed pages
>>> because everything is inflated in a balloon.
>>
>> Correct.
>>
>>> That can theoretically also happen when managing to fully inflate the
>>> balloon in one zone and then, somehow, the zones get rebuilt.
>>
>> I think you are right. I didn't think of that scenario.
>>
>>> build_zonerefs_node() documents "Add all populated zones of a node to
>>> the zonelist" but checks for managed zones, which is wrong.
>>>
>>> See https://lkml.kernel.org/r/20220201070044.zbm3obsoimhz3xd3@master
>>
>> I found commit 6aa303defb7454 which introduced this test. I thought
>> it was needed due to the problem this commit tried to solve. Maybe I
>> was wrong and that commit shouldn't have changed the condition when
>> building the zonelist, but just the ones in the allocation paths.
>
>In regard to kswapd, that is currently being worked on via
>
>https://lkml.kernel.org/r/20220329010901.1654-2-richard.weiyang@xxxxxxxxx
>

Thanks, David

Do you think it is the right time to repost the original fix?
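
For anyone following along, the populated vs. managed distinction discussed
above can be modelled in user space roughly as below. This is only a sketch
and not the kernel code: struct toy_zone, toy_populated(), toy_managed() and
toy_build_zonerefs() are made-up names that only imitate the idea behind
populated_zone(), managed_zone() and build_zonerefs_node(). A zone whose
pages are all held in the balloon has present pages but no managed pages.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct zone; only the two counters that matter here. */
struct toy_zone {
	const char *name;
	unsigned long present_pages;	/* pages physically present in the zone */
	unsigned long managed_pages;	/* pages handed to the buddy allocator */
};

/* Imitates the idea of populated_zone(): the zone has memory at all. */
static int toy_populated(const struct toy_zone *z)
{
	return z->present_pages != 0;
}

/* Imitates the idea of managed_zone(): the allocator owns pages in it. */
static int toy_managed(const struct toy_zone *z)
{
	return z->managed_pages != 0;
}

#define TOY_MAX_ZONES 4

/*
 * Walk the zones from highest to lowest and record those accepted by
 * 'pred'. Returns the number of entries written to 'out'.
 */
static size_t toy_build_zonerefs(const struct toy_zone *zones, size_t nr_zones,
				 int (*pred)(const struct toy_zone *),
				 const struct toy_zone **out)
{
	size_t n = 0;

	assert(nr_zones <= TOY_MAX_ZONES);
	for (size_t i = nr_zones; i-- > 0; ) {
		if (pred(&zones[i]))
			out[n++] = &zones[i];
	}
	return n;
}
```

With a fully ballooned-out highest zone (present_pages > 0, managed_pages
== 0), passing toy_managed leaves that zone out of the list, while passing
toy_populated keeps it. A list built with the managed check never picks the
zone up later unless it is rebuilt after the balloon deflates.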

>--
>Thanks,
>
>David / dhildenb

--
Wei Yang
Help you, Help me