selftests: kvm: allocating extra mem in slot 0 (Was: Re: [PATCH] selftests: kvm: fix overlapping addresses in memslot_perf_test)

From: Maciej S. Szmigiero
Date: Fri Jun 04 2021 - 12:49:29 EST


On 04.06.2021 05:35, Duan, Zhenzhong wrote:
-----Original Message-----
From: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
Sent: Thursday, June 3, 2021 9:06 PM
To: Andrew Jones <drjones@xxxxxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
kvm@xxxxxxxxxxxxxxx; Duan, Zhenzhong <zhenzhong.duan@xxxxxxxxx>
Subject: Re: [PATCH] selftests: kvm: fix overlapping addresses in
memslot_perf_test

On 03.06.2021 14:37, Andrew Jones wrote:
On Thu, Jun 03, 2021 at 05:26:33AM +0000, Duan, Zhenzhong wrote:
-----Original Message-----
From: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
Sent: Thursday, June 3, 2021 7:07 AM
To: Paolo Bonzini <pbonzini@xxxxxxxxxx>; Duan, Zhenzhong
<zhenzhong.duan@xxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; Andrew Jones
<drjones@xxxxxxxxxx>
Subject: Re: [PATCH] selftests: kvm: fix overlapping addresses in
memslot_perf_test

On 30.05.2021 01:13, Maciej S. Szmigiero wrote:
On 29.05.2021 12:20, Paolo Bonzini wrote:
On 28/05/21 21:51, Maciej S. Szmigiero wrote:
On 28.05.2021 21:11, Paolo Bonzini wrote:
The memory that is allocated in vm_create is already mapped
close to GPA 0, because test_execute passes the requested memory
to prepare_vm.  This causes overlapping memory regions and the
test crashes.  For simplicity just move MEM_GPA higher.

Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>

I am not sure that I understand the issue correctly, is
vm_create_default() already reserving low GPAs (around
0x10000000) on some arches or run environments?

It maps the number of pages you pass in the second argument, see
vm_create.

	if (phy_pages != 0)
		vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
					    0, 0, phy_pages, 0);

In this case:

   data->vm = vm_create_default(VCPU_ID, mempages, guest_code);

called here:

   if (!prepare_vm(data, nslots, maxslots, tdata->guest_code,
                   mem_size, slot_runtime)) {

where mempages is mem_size, which is declared as:

         uint64_t mem_size = tdata->mem_size ? : MEM_SIZE_PAGES;

but actually a better fix is just to pass a small fixed value (e.g.
1024) to vm_create_default, since all other regions are added by
hand

Yes, but the argument that is passed to vm_create_default()
(mem_size in the case of the test) is not passed as phy_pages to
vm_create().
Rather, vm_create_with_vcpus() calculates an upper bound of the extra
memory needed to cover that much guest memory (essentially its page
tables) and passes only that, plus the default slot 0 size and the
vCPU stack pages, to vm_create().
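
For reference, the sizing logic in vm_create_with_vcpus() (as it was
before the patch discussed below) reads:

	uint64_t vcpu_pages = (DEFAULT_STACK_PGS + num_percpu_pages) * nr_vcpus;
	uint64_t extra_pg_pages = (extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
	uint64_t pages = DEFAULT_GUEST_PHY_PAGES + vcpu_pages + extra_pg_pages;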

The biggest possible mem_size from memslot_perf_test is 512 MiB + 1
page; according to my calculations this results in phy_pages of 1029
(~4 MiB) in the x86-64 case and around 1540 (~6 MiB) in the s390x
case (here I am not sure about the exact number, since s390x has
some additional alignment requirements).

Both values are well below 256 MiB (0x10000000UL), so I was
wondering what kind of circumstances can make these allocations
collide (maybe I am missing something in my analysis).

I see now that there has been a patch merged last week called
"selftests: kvm: make allocation of extra memory take effect" by
Zhenzhong that now also allocates the whole memory size passed to
vm_create_default() (instead of just the page tables for that much memory).

The commit message of this patch says that "perf_test_util and
kvm_page_table_test use it to alloc extra memory currently"; however,
both kvm_page_table_test and the lib/perf_test_util framework
explicitly add the required memory themselves, by doing a
vm_userspace_mem_region_add() call for the same memory size that
they pass to vm_create_default().

So now they allocate this memory twice.
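
That is, schematically (the names here are made up for illustration,
not the tests' actual variables):

	/* With the patch applied, this call alone already adds
	 * test_mem_pages of backing memory to memslot 0 ... */
	vm = vm_create_default(VCPU_ID, test_mem_pages, guest_code);

	/* ... and the test then adds the same amount a second time,
	 * in its own memslot at the GPA it actually wants: */
	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
				    test_gpa, TEST_MEM_SLOT_INDEX,
				    test_mem_pages, 0);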

@Zhenzhong: did you notice improper operation of either
kvm_page_table_test or perf_test_util-based tests
(demand_paging_test, dirty_log_perf_test,
memslot_modification_stress_test) before your patch?
No


They seem to work fine for me without the patch (and I guess other
people would have noticed earlier, too, if they were broken).

After this patch, not only do these tests allocate their memory twice,
but it also becomes harder to make vm_create_default() allocate the
right amount of memory for the page tables in cases where the test
needs to explicitly use
vm_userspace_mem_region_add() for its allocations (because it wants
the allocation placed at a specific GPA or in a specific memslot).

One then basically has to open-code the page table size calculations
from vm_create_with_vcpus() in the particular test, taking into
account that vm_create_with_vcpus() will not only allocate the
passed memory size (the calculated page table size) but will also
behave as if it were allocating space for page tables for these page
tables (even though the passed memory size is itself supposed to cover them).
It looks like we have a different understanding of the parameter
extra_mem_pages of vm_create_default().

In your usage, extra_mem_pages is only used for page table
calculations; the real extra memory allocation happens in a separate
call to vm_userspace_mem_region_add().

Yes, this is the meaning that kvm selftests has always had for
extra_mem_pages of vm_create_default(). If we'd rather have a
different meaning, that's fine, but we need to change all the callers
of the function as well.

If we change the meaning of extra_mem_pages (i.e. keep the patch), it
would be good to still have an additional parameter to
vm_create_with_vcpus() so that tests which have to allocate their
memory on their own via vm_userspace_mem_region_add() can ask
vm_create_with_vcpus() to just allocate the page tables for these
manual allocations.
Or a helper to calculate the required extra_mem_pages for them.
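
Such a helper could look something like this (just a sketch; note
that PTES_PER_MIN_PAGE is currently private to lib/kvm_util.c so it
would have to be exported, and this ignores the
page-tables-for-page-tables effect mentioned above):

/* Rough number of pages to pass as extra_mem_pages to
 * vm_create_default() so that enough page table pages are set aside
 * for a later manual vm_userspace_mem_region_add() allocation of
 * data_pages pages. */
static inline uint64_t extra_pages_for_manual_alloc(uint64_t data_pages)
{
	return data_pages / PTES_PER_MIN_PAGE * 2;
}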

If we decide to leave vm_create_default() the way it was by reverting
this patch, then maybe we should consider renaming the parameter
and/or documenting the function.

Adding a descriptive comment (and possibly renaming the parameter) seems
like a much simpler solution to me than adapting these tests (and possibly
adding the parameter or helper described above for them).
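
For example, something along these lines above its declaration
(wording just a suggestion):

/*
 * Create a default VM for vcpuid, running guest_code.
 *
 * Note: extra_mem_pages only sizes the additional page tables that
 * are set aside; no extra backing memory is added to any memslot.
 * A test that needs more guest memory must add it itself with
 * vm_userspace_mem_region_add().
 */
struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
				 void *guest_code);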

Agree, I prefer the simpler way.

I also have an idea for custom slot0 memory: keep extra_mem_pages working the original way and add a global slot0_pages for custom slot0 memory. Maybe not a good choice as it's not thread-safe; just for discussion. That is:
1. revert "selftests: kvm: make allocation of extra memory take effect"
2. add below patch
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -280,6 +280,9 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
 struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
 				 void *guest_code);
 
+struct kvm_vm *vm_create_slot0(uint32_t vcpuid, uint64_t slot0_mem_pages,
+			       uint64_t extra_mem_pages, void *guest_code);
+
 /* Same as vm_create_default, but can be used for more than one vcpu */
 struct kvm_vm *vm_create_default_with_vcpus(uint32_t nr_vcpus, uint64_t extra_mem_pages,
 					    uint32_t num_percpu_pages, void *guest_code,
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 63418df921f0..56b1225865d5 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -196,6 +196,7 @@ const struct vm_guest_mode_params vm_guest_mode_params[] = {
 _Static_assert(sizeof(vm_guest_mode_params)/sizeof(struct vm_guest_mode_params) == NUM_VM_MODES,
 	       "Missing new mode params?");
 
+uint64_t slot0_pages = DEFAULT_GUEST_PHY_PAGES;
 /*
  * VM Create
  *
@@ -319,8 +320,8 @@ struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
 	 * than N/x*2.
 	 */
 	uint64_t vcpu_pages = (DEFAULT_STACK_PGS + num_percpu_pages) * nr_vcpus;
-	uint64_t extra_pg_pages = (extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
-	uint64_t pages = DEFAULT_GUEST_PHY_PAGES + vcpu_pages + extra_pg_pages;
+	uint64_t extra_pg_pages = (slot0_pages + extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
+	uint64_t pages = slot0_pages + vcpu_pages + extra_pg_pages;
 	struct kvm_vm *vm;
 	int i;
 
@@ -358,9 +359,18 @@ struct kvm_vm *vm_create_default_with_vcpus(uint32_t nr_vcpus, uint64_t extra_me
 					    num_percpu_pages, guest_code, vcpuids);
 }
 
+struct kvm_vm *vm_create_slot0(uint32_t vcpuid, uint64_t slot0_mem_pages,
+			       uint64_t extra_mem_pages, void *guest_code)
+{
+	slot0_pages = slot0_mem_pages;
+	return vm_create_default_with_vcpus(1, extra_mem_pages, 0, guest_code,
+					    (uint32_t []){ vcpuid });
+}
+
 struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
 				 void *guest_code)
 {
+	slot0_pages = DEFAULT_GUEST_PHY_PAGES;
 	return vm_create_default_with_vcpus(1, extra_mem_pages, 0, guest_code,
 					    (uint32_t []){ vcpuid });
 }
@@ -626,6 +636,9 @@ void kvm_vm_free(struct kvm_vm *vmp)
 
 	/* Free the structure describing the VM. */
 	free(vmp);
+
+	/* Restore slot0 memory to default size for next VM creation */
+	slot0_pages = DEFAULT_GUEST_PHY_PAGES;
 }
 
 /*
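
A test wanting extra memory in slot0 would then do something like
this (test_mem_pages is a made-up name for whatever the test needs;
slot0_mem_pages is the total slot0 size, so the default is added on
top):

	vm = vm_create_slot0(VCPU_ID, DEFAULT_GUEST_PHY_PAGES + test_mem_pages,
			     0, guest_code);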

In terms of thread safety, a quick glance at the current tests seems
to suggest that none of them create VMs from anything but their main
threads (although the s390x diag318 handler for sync_regs_test does
some suspicious stuff).

But I think a better solution than adding a global variable as an implicit
parameter to vm_create_with_vcpus() is to simply add an extra explicit
parameter to this function - it has just 3 callers, which then need
only trivial adaptation.
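
That is, roughly (sketch only, untested):

struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
				    uint64_t slot0_mem_pages,
				    uint64_t extra_mem_pages,
				    uint32_t num_percpu_pages, void *guest_code,
				    uint32_t vcpuids[]);

with the sizing code then using slot0_mem_pages directly instead of
the global, and the existing callers passing DEFAULT_GUEST_PHY_PAGES.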

Thanks
Zhenzhong


Thanks,
Maciej