RE: selftests: kvm: allocating extra mem in slot 0 (Was: Re: [PATCH] selftests: kvm: fix overlapping addresses in memslot_perf_test)

From: Duan, Zhenzhong
Date: Mon Jun 07 2021 - 23:20:09 EST




> -----Original Message-----
> From: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
> Sent: Saturday, June 5, 2021 12:49 AM
> To: Duan, Zhenzhong <zhenzhong.duan@xxxxxxxxx>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> kvm@xxxxxxxxxxxxxxx; Andrew Jones <drjones@xxxxxxxxxx>
> Subject: selftests: kvm: allocating extra mem in slot 0 (Was: Re: [PATCH]
> selftests: kvm: fix overlapping addresses in memslot_perf_test)
>
> On 04.06.2021 05:35, Duan, Zhenzhong wrote:
> >> -----Original Message-----
> >> From: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
> >> Sent: Thursday, June 3, 2021 9:06 PM
> >> To: Andrew Jones <drjones@xxxxxxxxxx>
> >> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>;
> >> linux-kernel@xxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; Duan, Zhenzhong
> >> <zhenzhong.duan@xxxxxxxxx>
> >> Subject: Re: [PATCH] selftests: kvm: fix overlapping addresses in
> >> memslot_perf_test
> >>
> >> On 03.06.2021 14:37, Andrew Jones wrote:
> >>> On Thu, Jun 03, 2021 at 05:26:33AM +0000, Duan, Zhenzhong wrote:
> >>>>> -----Original Message-----
> >>>>> From: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
> >>>>> Sent: Thursday, June 3, 2021 7:07 AM
> >>>>> To: Paolo Bonzini <pbonzini@xxxxxxxxxx>; Duan, Zhenzhong
> >>>>> <zhenzhong.duan@xxxxxxxxx>
> >>>>> Cc: linux-kernel@xxxxxxxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; Andrew
> >>>>> Jones <drjones@xxxxxxxxxx>
> >>>>> Subject: Re: [PATCH] selftests: kvm: fix overlapping addresses in
> >>>>> memslot_perf_test
> >>>>>
> >>>>> On 30.05.2021 01:13, Maciej S. Szmigiero wrote:
> >>>>>> On 29.05.2021 12:20, Paolo Bonzini wrote:
> >>>>>>> On 28/05/21 21:51, Maciej S. Szmigiero wrote:
> >>>>>>>> On 28.05.2021 21:11, Paolo Bonzini wrote:
> >>>>>>>>> The memory that is allocated in vm_create is already mapped
> >>>>>>>>> close to GPA 0, because test_execute passes the requested
> >>>>>>>>> memory to prepare_vm.  This causes overlapping memory regions
> >>>>>>>>> and the test crashes.  For simplicity just move MEM_GPA higher.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >>>>>>>>
> >>>>>>>> I am not sure that I understand the issue correctly, is
> >>>>>>>> vm_create_default() already reserving low GPAs (around
> >>>>>>>> 0x10000000) on some arches or run environments?
> >>>>>>>
> >>>>>>> It maps the number of pages you pass in the second argument, see
> >>>>>>> vm_create.
> >>>>>>>
> >>>>>>>    if (phy_pages != 0)
> >>>>>>>      vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
> >>>>>>>                                  0, 0, phy_pages, 0);
> >>>>>>>
> >>>>>>> In this case:
> >>>>>>>
> >>>>>>>    data->vm = vm_create_default(VCPU_ID, mempages, guest_code);
> >>>>>>>
> >>>>>>> called here:
> >>>>>>>
> >>>>>>>    if (!prepare_vm(data, nslots, maxslots, tdata->guest_code,
> >>>>>>>                    mem_size, slot_runtime)) {
> >>>>>>>
> >>>>>>> where mempages is mem_size, which is declared as:
> >>>>>>>
> >>>>>>>          uint64_t mem_size = tdata->mem_size ? : MEM_SIZE_PAGES;
> >>>>>>>
> >>>>>>> but actually a better fix is just to pass a small fixed value (e.g.
> >>>>>>> 1024) to vm_create_default, since all other regions are added by
> >>>>>>> hand
> >>>>>>
> >>>>>> Yes, but the argument that is passed to vm_create_default()
> >>>>>> (mem_size in the case of the test) is not passed as phy_pages to
> >>>>>> vm_create().
> >>>>>> Rather, vm_create_with_vcpus() calculates some upper bound of
> >>>>>> extra memory that is needed to cover that much guest memory
> >>>>>> (including for its page tables).
> >>>>>>
> >>>>>> The biggest possible mem_size from memslot_perf_test is 512 MiB +
> >>>>>> 1 page, according to my calculations this results in phy_pages of
> >>>>>> 1029
> >>>>>> (~4 MiB) in the x86-64 case and around 1540 (~6 MiB) in the s390x
> >>>>>> case (here I am not sure about the exact number, since s390x has
> >>>>>> some additional alignment requirements).
> >>>>>>
> >>>>>> Both values are well below 256 MiB (0x10000000UL), so I was
> >>>>>> wondering what kind of circumstances can make these allocations
> >>>>>> collide (maybe I am missing something in my analysis).
> >>>>>
> >>>>> I see now that there has been a patch merged last week called
> >>>>> "selftests: kvm: make allocation of extra memory take effect" by
> >>>>> Zhenzhong that now allocates also the whole memory size passed to
> >>>>> vm_create_default() (instead of just page tables for that much
> >>>>> memory).
> >>>>>
> >>>>> The commit message of this patch says that "perf_test_util and
> >>>>> kvm_page_table_test use it to alloc extra memory currently",
> >>>>> however both kvm_page_table_test and lib/perf_test_util framework
> >>>>> explicitly add the required memory allocation by doing a
> >>>>> vm_userspace_mem_region_add() call for the same memory size that
> >>>>> they pass to vm_create_default().
> >>>>>
> >>>>> So now they allocate this memory twice.
> >>>>>
> >>>>> @Zhenzhong: did you notice improper operation of either
> >>>>> kvm_page_table_test or perf_test_util-based tests
> >>>>> (demand_paging_test, dirty_log_perf_test,
> >>>>> memslot_modification_stress_test) before your patch?
> >>>> No
> >>>>
> >>>>>
> >>>>> They seem to work fine for me without the patch (and I guess other
> >>>>> people would have noticed earlier, too, if they were broken).
> >>>>>
> >>>>> After this patch not only these tests allocate their memory twice
> >>>>> but it is harder to make vm_create_default() allocate the right
> >>>>> amount of memory for the page tables in cases where the test needs
> >>>>> to explicitly use
> >>>>> vm_userspace_mem_region_add() for its allocations (because it
> >>>>> wants the allocation placed at a specific GPA or in a specific memslot).
> >>>>>
> >>>>> One has to basically open-code the page table size calculations
> >>>>> from
> >>>>> vm_create_with_vcpus() in the particular test then, taking also
> >>>>> into account that vm_create_with_vcpus() will not only allocate
> >>>>> the passed memory size (calculated page tables size) but also
> >>>>> behave like it was allocating space for page tables for these page
> >>>>> tables (even though the passed memory size itself is supposed to
> >>>>> cover them).
> >>>> Looks like we have different understandings of the parameter
> >>>> extra_mem_pages of vm_create_default().
> >>>>
> >>>> In your usage, extra_mem_pages is only used for page table
> >>>> calculations; the real extra memory allocation happens in the extra
> >>>> call of vm_userspace_mem_region_add().
> >>>
> >>> Yes, this is the meaning that kvm selftests has always had for
> >>> extra_mem_pages of vm_create_default(). If we'd rather have a
> >>> different meaning, that's fine, but we need to change all the
> >>> callers of the function as well.
> >>
> >> If we change the meaning of extra_mem_pages (keep the patch) it would
> >> be good to still have an additional parameter to
> >> vm_create_with_vcpus() for tests that have to allocate their memory
> >> on their own via
> >> vm_userspace_mem_region_add() for vm_create_with_vcpus() to just
> >> allocate the page tables for these manual allocations.
> >> Or a helper to calculate the required extra_mem_pages for them.
> >>
> >>> If we decide to leave vm_create_default() the way it was by
> >>> reverting this patch, then maybe we should consider renaming the
> >>> parameter and/or documenting the function.
> >>
> >> Adding a descriptive comment (and possibly renaming the parameter)
> >> seems like a much simpler solution to me than adapting these tests
> >> (and possibly adding the parameter or helper described above for them).
> >
> > Agree, I prefer the simpler way.
> >
> > I also have an idea for supporting a custom slot0 memory size: keep
> > extra_mem_pages working the original way and add a global slot0_pages for
> > the custom slot0 size. Maybe not a good choice as it's not thread safe,
> > just for discussion. That is:
> > 1. revert "selftests: kvm: make allocation of extra memory take effect"
> > 2. add the patch below
> > --- a/tools/testing/selftests/kvm/include/kvm_util.h
> > +++ b/tools/testing/selftests/kvm/include/kvm_util.h
> > @@ -280,6 +280,9 @@ vm_paddr_t vm_phy_pages_alloc(struct kvm_vm *vm, size_t num,
> >  struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
> >  				 void *guest_code);
> >  
> > +struct kvm_vm *vm_create_slot0(uint32_t vcpuid, uint64_t slot0_mem_pages,
> > +			       uint64_t extra_mem_pages, void *guest_code);
> > +
> >  /* Same as vm_create_default, but can be used for more than one vcpu */
> >  struct kvm_vm *vm_create_default_with_vcpus(uint32_t nr_vcpus, uint64_t extra_mem_pages,
> >  					    uint32_t num_percpu_pages, void *guest_code,
> > diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> > index 63418df921f0..56b1225865d5 100644
> > --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> > +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> > @@ -196,6 +196,7 @@ const struct vm_guest_mode_params vm_guest_mode_params[] = {
> >  _Static_assert(sizeof(vm_guest_mode_params)/sizeof(struct vm_guest_mode_params) == NUM_VM_MODES,
> >  	       "Missing new mode params?");
> >  
> > +uint64_t slot0_pages = DEFAULT_GUEST_PHY_PAGES;
> >  /*
> >   * VM Create
> >   *
> > @@ -319,8 +320,8 @@ struct kvm_vm *vm_create_with_vcpus(enum vm_guest_mode mode, uint32_t nr_vcpus,
> >  	 * than N/x*2.
> >  	 */
> >  	uint64_t vcpu_pages = (DEFAULT_STACK_PGS + num_percpu_pages) * nr_vcpus;
> > -	uint64_t extra_pg_pages = (extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
> > -	uint64_t pages = DEFAULT_GUEST_PHY_PAGES + vcpu_pages + extra_pg_pages;
> > +	uint64_t extra_pg_pages = (slot0_pages + extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
> > +	uint64_t pages = slot0_pages + vcpu_pages + extra_pg_pages;
> >  	struct kvm_vm *vm;
> >  	int i;
> >  
> > @@ -358,9 +359,18 @@ struct kvm_vm *vm_create_default_with_vcpus(uint32_t nr_vcpus, uint64_t extra_me
> >  					    num_percpu_pages, guest_code, vcpuids);
> >  }
> >  
> > +struct kvm_vm *vm_create_slot0(uint32_t vcpuid, uint64_t slot0_mem_pages,
> > +			       uint64_t extra_mem_pages, void *guest_code)
> > +{
> > +	slot0_pages = slot0_mem_pages;
> > +	return vm_create_default_with_vcpus(1, extra_mem_pages, 0, guest_code,
> > +					    (uint32_t []){ vcpuid });
> > +}
> > +
> >  struct kvm_vm *vm_create_default(uint32_t vcpuid, uint64_t extra_mem_pages,
> >  				 void *guest_code)
> >  {
> > +	slot0_pages = DEFAULT_GUEST_PHY_PAGES;
> >  	return vm_create_default_with_vcpus(1, extra_mem_pages, 0, guest_code,
> >  					    (uint32_t []){ vcpuid });
> >  }
> > @@ -626,6 +636,9 @@ void kvm_vm_free(struct kvm_vm *vmp)
> >  
> >  	/* Free the structure describing the VM. */
> >  	free(vmp);
> > +
> > +	/* Restore slot0 memory to default size for next VM creation */
> > +	slot0_pages = DEFAULT_GUEST_PHY_PAGES;
> >  }
> >  
> >  /*
>
> In terms of thread safety, a quick glance at current tests seems to suggest that
> none of them create VMs from anything but their main threads (although the
> s390x diag318 handler for sync_regs_test does some suspicious stuff).
>
> But I think a better solution than adding a global variable as an implicit
> parameter to vm_create_with_vcpus() is to simply add an extra explicit
> parameter to this function - it has just 3 callers that need to be
> (trivially) adapted then.

So we provide custom slot0 memory size support in vm_create_with_vcpus() rather than in vm_create_default(), since vm_create_with_vcpus() has only 3 callers. That's a good idea, and I see no test requires a custom slot0 size so far. Let me work out a small patchset to make all the changes suggested in this mail thread.
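For reference, below is a rough sketch of the sizing I have in mind once
vm_create_with_vcpus() takes an explicit slot0 parameter. It is only for
discussion: the helper name (vm_phy_pages_needed) and the fall-back-to-default
behaviour are tentative, and the macro values are copied here just to keep the
snippet self-contained (they should match the current x86_64 selftest defaults;
the real code would of course keep using the definitions from the kvm_util
headers).

#include <stdint.h>

/* Assumed current defaults, normally taken from the selftest headers */
#define DEFAULT_GUEST_PHY_PAGES		512	/* default slot0 size in pages */
#define DEFAULT_STACK_PGS		5	/* per-vCPU stack pages */
#define PTES_PER_MIN_PAGE		512	/* PTEs per minimum-size page */

/* Tentative helper: how many physical pages vm_create() should be given */
static uint64_t vm_phy_pages_needed(uint64_t slot0_mem_pages,
				    uint64_t extra_mem_pages,
				    uint32_t nr_vcpus,
				    uint32_t num_percpu_pages)
{
	/* A caller passing 0 keeps the old default slot0 size */
	if (!slot0_mem_pages)
		slot0_mem_pages = DEFAULT_GUEST_PHY_PAGES;

	/* Stack pages plus any per-vCPU test pages */
	uint64_t vcpu_pages = (DEFAULT_STACK_PGS + num_percpu_pages) * nr_vcpus;

	/* Upper bound on the page-table pages needed to map all of the above */
	uint64_t extra_pg_pages = (slot0_mem_pages + extra_mem_pages + vcpu_pages) /
				  PTES_PER_MIN_PAGE * 2;

	return slot0_mem_pages + vcpu_pages + extra_pg_pages;
}

With something like this, the three existing vm_create_with_vcpus() callers
would just pass DEFAULT_GUEST_PHY_PAGES (or 0) for slot0, while tests that add
their test memory themselves via vm_userspace_mem_region_add() keep passing
only the size they need page tables for as extra_mem_pages.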

Regards
Zhenzhong