Re: maybe revert commit c275a57f5ec3 "xen/balloon: Set balloon's initial state to number of existing RAM pages"

From: Boris Ostrovsky
Date: Mon Mar 27 2017 - 21:58:30 EST




On 03/27/2017 03:57 PM, Dan Streetman wrote:
On Fri, Mar 24, 2017 at 9:33 PM, Boris Ostrovsky
<boris.ostrovsky@xxxxxxxxxx> wrote:


I think we can all agree that the *ideal* situation would be for the
balloon driver not to immediately hotplug memory just so it can add 11
more pages, so maybe I just need to figure out why the balloon driver
thinks it needs 11 more pages, and fix that.



How does the new memory appear in the guest? Via online_pages()?

Or is ballooning triggered from watch_target()?

Yes, it's triggered from watch_target(), which then calls
online_pages() with the new memory. I added some debug output (all
numbers are in hex):

[ 0.500080] xen:balloon: Initialising balloon driver
[ 0.503027] xen:balloon: balloon_init: current/target pages 1fff9d
[ 0.504044] xen_balloon: Initialising balloon driver
[ 0.508046] xen_balloon: watch_target: new target 800000 kb
[ 0.508046] xen:balloon: balloon_set_new_target: target 200000
[ 0.524024] xen:balloon: current_credit: target pages 200000 current pages 1fff9d credit 63
[ 0.567055] xen:balloon: balloon_process: current_credit 63
[ 0.568005] xen:balloon: reserve_additional_memory: adding memory resource for 8000 pages
[ 3.694443] online_pages: pfn 210000 nr_pages 8000 type 0
[ 3.701072] xen:balloon: current_credit: target pages 200000 current pages 1fff9d credit 63
[ 3.701074] xen:balloon: balloon_process: current_credit 63
[ 3.701075] xen:balloon: increase_reservation: nr_pages 63
[ 3.701170] xen:balloon: increase_reservation: done, current_pages 1fffa8
[ 3.701172] xen:balloon: current_credit: target pages 200000 current pages 1fffa8 credit 58
[ 3.701173] xen:balloon: balloon_process: current_credit 58
[ 3.701173] xen:balloon: increase_reservation: nr_pages 58
[ 3.701180] xen:balloon: increase_reservation: XENMEM_populate_physmap err 0
[ 5.708085] xen:balloon: current_credit: target pages 200000 current pages 1fffa8 credit 58
[ 5.708088] xen:balloon: balloon_process: current_credit 58
[ 5.708089] xen:balloon: increase_reservation: nr_pages 58
[ 5.708106] xen:balloon: increase_reservation: XENMEM_populate_physmap err 0
[ 9.716065] xen:balloon: current_credit: target pages 200000 current pages 1fffa8 credit 58
[ 9.716068] xen:balloon: balloon_process: current_credit 58
[ 9.716069] xen:balloon: increase_reservation: nr_pages 58
[ 9.716087] xen:balloon: increase_reservation: XENMEM_populate_physmap err 0


That continues forever at the maximum retry interval (32), since
max_retry_count is unlimited. So I think I understand things now.
First, current_pages is set properly based on the e820 map:

$ dmesg|grep -i e820
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000efffffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000020fffffff] usable
[ 0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[ 0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[ 0.000000] e820: last_pfn = 0x210000 max_arch_pfn = 0x400000000
[ 0.000000] e820: last_pfn = 0xf0000 max_arch_pfn = 0x400000000
[ 0.000000] e820: [mem 0xf0000000-0xfbffffff] available for PCI devices
[ 0.528007] e820: reserve RAM buffer [mem 0x0009e000-0x0009ffff]
ubuntu@ip-172-31-60-112:~$ printf "%x\n" $[ 0x210000 - 0x100000 + 0xf0000 - 0x100 + 0x9e - 1 ]
1fff9d


Then the Xen balloon driver notices its target has been set to
0x200000 by the hypervisor. That target does account for the hole at
0xf0000 to 0x100000, but it doesn't account for the hole at 0xe0 to
0x100 (0x20 pages), nor the hole at 0x9e to 0xa0 (2 pages), nor the
unlisted hole (which the kernel removes) at 0xa0 to 0xe0 (0x40 pages).
That's 0x62 pages; adding the 1-page hole at address 0 that the kernel
always reserves gives 0x63 pages of holes, which aren't accounted for
in the hypervisor's target.

So the balloon driver hotplugs the memory and tries to increase its
reservation to bring current_pages up to the target. However, when it
calls the hypervisor to populate the physmap, the hypervisor only
allows 11 (0xb) pages to be populated; all calls after that get back 0
from the hypervisor.

Do you think the hypervisor's balloon target should account for the
e820 holes (and for the kernel's added hole at address 0)?
Alternatively/additionally, if the hypervisor doesn't want to support
ballooning, should it just return an error from the call to populate
the physmap, rather than allowing those 11 pages?

At this point, it doesn't seem to me like the kernel is doing anything
wrong, correct?



I think there is indeed a disconnect between target memory (provided by the toolstack) and current memory (i.e. the actual pages available to the guest).

For example

[ 0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved

are missed in the target calculation. The hvmloader marks them as RESERVED (in build_e820_table()), but the target value is not aware of this.

And then the same problem repeats when the kernel removes the 0x000a0000-0x000fffff chunk.

(BTW, this is all happening before the new 0x8000 pages are onlined, which takes place much later and looks to me like a separate, unrelated event.)

-boris