Re: [PATCH] xen/balloon: fix page onlining when populating new zone

From: Juergen Gross
Date: Thu Apr 07 2022 - 04:51:33 EST


On 07.04.22 10:23, David Hildenbrand wrote:
On 06.04.22 15:32, Juergen Gross wrote:
When onlining a new memory page in a guest, the Xen balloon driver
adds it to the ballooned pages instead of making it available for
immediate use. This is meant to allow raising a guest's upper memory
limit via memory hotplug without having to assign the new memory in
one go.

If the upper memory limit is raised above 4G, the new memory will
populate the ZONE_NORMAL memory zone, which wasn't populated before.
The newly populated zone won't be added to the list of zones looked at
by the page allocator though, as only zones with available memory are
added, and the memory isn't available yet because it is ballooned out.

I think we just recently discussed these corner cases on the -mm list.

Indeed.

The issue is having effectively populated zones without managed pages
because everything is inflated in a balloon.

Correct.

That can theoretically also happen when managing to fully inflate the
balloon in one zone and then, somehow, the zones get rebuilt.

I think you are right. I didn't think of that scenario.

build_zonerefs_node() documents "Add all populated zones of a node to
the zonelist" but checks for managed zones, which is wrong.

See https://lkml.kernel.org/r/20220201070044.zbm3obsoimhz3xd3@master

I found commit 6aa303defb7454 which introduced this test. I thought
it was needed due to the problem this commit tried to solve. Maybe I
was wrong and that commit shouldn't have changed the condition when
building the zonelist, but just the ones in the allocation paths.



This will result in the new memory being assigned to the guest, but
without the allocator being able to use it.

When running as a PV guest the situation is even worse: if the guest
was started with less memory than allowed and the upper limit is lower
than 4G, ballooning up will have the same effect as hotplugging new
memory. This is due to the usage of the zone device functionality
since commit 9e2369c06c8a ("xen: add helpers to allocate unpopulated
memory") for creating mappings of other guests' pages, which as a side
effect is being used for PV guest ballooning, too.

Fix this by checking in xen_online_page() whether the new memory page
will be the first in a new zone. If this is the case, add another page
to the balloon and use the first memory page of the new chunk as a
replacement for this now ballooned out page. This will result in the
newly populated zone containing one page being available for the page
allocator, which in turn will lead to the zone being added to the
allocator.
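
The page swap described above can be illustrated with a small userspace
model. This is only a sketch and not the patch itself: struct toy_zone,
struct toy_balloon and toy_online_page() are hypothetical names, the
"first page in a new zone" test is assumed to be "the zone has no present
pages yet", and the page given up in exchange is assumed to come from an
already usable donor zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct zone; only two counters are modeled. */
struct toy_zone {
    unsigned long present_pages;   /* pages that exist in the zone */
    unsigned long managed_pages;   /* pages handed to the page allocator */
};

/* Toy stand-in for the balloon driver's bookkeeping. */
struct toy_balloon {
    unsigned long ballooned_pages; /* pages currently held by the balloon */
};

/*
 * Hypothetical model of onlining one hotplugged page.
 * Returns true if the page was given to the allocator (swap case),
 * false if it went into the balloon as usual.
 */
static bool toy_online_page(struct toy_zone *new_zone,
                            struct toy_zone *donor_zone,
                            struct toy_balloon *balloon)
{
    assert(new_zone != NULL && donor_zone != NULL && balloon != NULL);

    /* Assumption: "first page of a new zone" means no present pages yet. */
    bool first_in_zone = (new_zone->present_pages == 0);

    new_zone->present_pages++;

    if (first_in_zone && donor_zone->managed_pages > 0) {
        /* Balloon out one page that was usable so far ... */
        donor_zone->managed_pages--;
        balloon->ballooned_pages++;
        /* ... and use the new page as its replacement. */
        new_zone->managed_pages++;
        return true;
    }

    /* Default behavior: the new page goes straight into the balloon. */
    balloon->ballooned_pages++;
    return false;
}
```

In this model the amount of usable memory stays the same after the swap,
while the new zone ends up with one managed page and therefore passes a
managed-pages check.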

This somehow feels like a hack for something that should be handled in
the core instead :/

Okay, I'll rework the patch (better wording might be: replace) to switch
build_zonerefs_node() to use populated_zone() instead of managed_zone().
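
To make the difference between the two predicates concrete, here is a
minimal userspace sketch. It is not kernel code: struct toy_zone and the
toy_*() helpers are hypothetical stand-ins that only mirror the idea that
populated_zone() looks at present pages while managed_zone() looks at
pages handed to the allocator, and that the zonelist is filled from the
highest zone downwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct zone. */
struct toy_zone {
    const char *name;
    unsigned long present_pages;  /* pages that exist in the zone */
    unsigned long managed_pages;  /* pages handed to the page allocator */
};

/* Modeled after populated_zone(): does the zone contain any pages? */
static bool toy_populated_zone(const struct toy_zone *z)
{
    return z->present_pages != 0;
}

/* Modeled after managed_zone(): does the allocator own any of them? */
static bool toy_managed_zone(const struct toy_zone *z)
{
    return z->managed_pages != 0;
}

typedef bool (*toy_zone_filter)(const struct toy_zone *);

/*
 * Simplified zonelist construction: walk the zones from highest to
 * lowest and keep those accepted by the filter. Returns the number of
 * entries written to out, which must have room for nr pointers.
 */
static size_t toy_build_zonerefs(const struct toy_zone *zones, size_t nr,
                                 toy_zone_filter keep,
                                 const struct toy_zone **out)
{
    size_t n = 0;

    assert(zones != NULL && keep != NULL && out != NULL);

    for (size_t i = nr; i-- > 0; ) {
        if (keep(&zones[i]))
            out[n++] = &zones[i];
    }
    return n;
}
```

With a zone that has present pages but zero managed pages (everything
ballooned out), the managed filter leaves it out of the list, while the
populated filter keeps it, so it is already in place once pages are
released from the balloon.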


Juergen
