Re: [PATCH v4 00/10] Create a userfaultfd demand paging test
From: Paolo Bonzini
Date: Fri Jan 24 2020 - 04:03:32 EST
On 23/01/20 19:04, Ben Gardon wrote:
> When handling page faults for many vCPUs during demand paging, KVM's MMU
> lock becomes highly contended. This series creates a test with a naive
> userfaultfd based demand paging implementation to demonstrate that
> contention. This test serves both as a functional test of userfaultfd
> and a microbenchmark of demand paging performance with a variable number
> of vCPUs and memory per vCPU.
>
> The test creates N userfaultfd threads, N vCPUs, and a region of memory
> with M pages per vCPU. The N userfaultfd polling threads are each set up
> to serve faults on a region of memory corresponding to one of the vCPUs.
> Each of the vCPUs is then started, and touches each page of its disjoint
> memory region, sequentially. In response to faults, the userfaultfd
> threads copy a static buffer into the guest's memory. This creates a
> worst case for MMU lock contention as we have removed most of the
> contention between the userfaultfd threads and there is no time required
> to fetch the contents of guest memory.
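
As a rough illustration of the layout described above (not code from the series), the disjoint per-vCPU regions can be modelled with simple address arithmetic. The helper names vcpu_region_start and vcpu_for_addr are hypothetical, and the sketch assumes the regions are laid out back to back from a common base, which the cover letter does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: first byte of the region owned by vcpu_id, assuming
 * the per-vCPU regions are contiguous and start at 'base'.
 */
static uint64_t vcpu_region_start(uint64_t base, uint64_t pages_per_vcpu,
				  uint64_t page_size, unsigned int vcpu_id)
{
	return base + (uint64_t)vcpu_id * pages_per_vcpu * page_size;
}

/*
 * Hypothetical helper: index of the vCPU (and so of the fault-handling
 * thread) whose region contains 'addr', or -1 if addr is outside all regions.
 */
static int vcpu_for_addr(uint64_t base, uint64_t pages_per_vcpu,
			 uint64_t page_size, unsigned int nr_vcpus,
			 uint64_t addr)
{
	uint64_t region_size = pages_per_vcpu * page_size;

	assert(region_size != 0);
	if (addr < base || addr - base >= (uint64_t)nr_vcpus * region_size)
		return -1;
	return (int)((addr - base) / region_size);
}
```

With 4 pages of 4096 bytes per vCPU, region 2 starts 32768 bytes past the base, and no address belongs to more than one region, so each handler thread only ever sees faults from its own vCPU.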
>
> This test was run successfully on Intel Haswell, Broadwell, and
> Cascadelake hosts with a variety of vCPU counts and memory sizes.
>
> This test was adapted from the dirty_log_test.
>
> The series can also be viewed in Gerrit here:
> https://linux-review.googlesource.com/c/virt/kvm/kvm/+/1464
> (Thanks to Dmitry Vyukov <dvyukov@xxxxxxxxxx> for setting up the Gerrit
> instance)
>
> v4 (Responding to feedback from Andrew Jones, Peter Xu, and Peter Shier):
> - Tested this revision by running
>     demand_paging_test
>   at each commit in the series on an Intel Haswell machine. Ran
>     demand_paging_test -u -v 8 -b 8M -d 10
>   on the same machine at the last commit in the series.
> - Re-added partial aarch64 support, though aarch64 and s390 remain
> untested
> - Implemented pipefd polling to reduce UFFD thread exit latency
> - Added variable unit input for memory size so users can pass command
> line arguments of the form -b 24M instead of the raw number of bytes
> - Moved a missing break from a patch later in the series to an earlier
> one
> - Moved to syncing per-vCPU global variables to guest and looking up
> per-vCPU arguments based on a single CPU ID passed to each guest
> vCPU. This allows for future patches to pass more than the supported
> number of arguments for each arch to the vCPUs.
> - Implemented vcpu_args_set for s390 and aarch64 [UNTESTED]
> - Changed vm_create to always allocate memslot 0 at 4G instead of only
> when the number of pages required is large.
> - Changed vcpu_wss to vcpu_memory_size for clarity.
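
For readers wondering what accepting "-b 24M" involves, here is a minimal sketch of a size parser with unit suffixes. It is not the code added to test_util.c in this series. The name parse_size_sketch, the accepted suffixes (K, M, G as powers of 1024) and the error convention are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser: converts "24M" to 24 * 2^20 bytes.
 * Returns 0 on success, -1 on malformed input or overflow.
 */
static int parse_size_sketch(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long val;
	unsigned int shift = 0;

	assert(s != NULL && out != NULL);

	/* Reject empty strings, signs and leading whitespace up front. */
	if (*s < '0' || *s > '9')
		return -1;

	errno = 0;
	val = strtoull(s, &end, 10);
	if (errno != 0)
		return -1;

	/* An optional single-letter suffix selects the multiplier. */
	switch (*end) {
	case '\0':
		break;
	case 'k': case 'K':
		shift = 10; end++;
		break;
	case 'm': case 'M':
		shift = 20; end++;
		break;
	case 'g': case 'G':
		shift = 30; end++;
		break;
	default:
		return -1;
	}

	/* Nothing may follow the suffix, and the result must fit in 64 bits. */
	if (*end != '\0')
		return -1;
	if (shift && val > (UINT64_MAX >> shift))
		return -1;

	*out = (uint64_t)val << shift;
	return 0;
}
```

Under these assumptions "8M" yields 8388608, a bare "4096" is taken as a byte count, and inputs such as "12Q" or "1GB" are rejected rather than silently truncated.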
>
> Ben Gardon (10):
> KVM: selftests: Create a demand paging test
> KVM: selftests: Add demand paging content to the demand paging test
> KVM: selftests: Add configurable demand paging delay
> KVM: selftests: Add memory size parameter to the demand paging test
> KVM: selftests: Pass args to vCPU in global vCPU args struct
> KVM: selftests: Add support for vcpu_args_set to aarch64 and s390x
> KVM: selftests: Support multiple vCPUs in demand paging test
> KVM: selftests: Time guest demand paging
> KVM: selftests: Stop memslot creation in KVM internal memslot region
> KVM: selftests: Move memslot 0 above KVM internal memslots
>
> tools/testing/selftests/kvm/.gitignore | 1 +
> tools/testing/selftests/kvm/Makefile | 5 +-
> .../selftests/kvm/demand_paging_test.c | 680 ++++++++++++++++++
> .../testing/selftests/kvm/include/test_util.h | 2 +
> .../selftests/kvm/lib/aarch64/processor.c | 33 +
> tools/testing/selftests/kvm/lib/kvm_util.c | 27 +-
> .../selftests/kvm/lib/s390x/processor.c | 35 +
> tools/testing/selftests/kvm/lib/test_util.c | 61 ++
> 8 files changed, 839 insertions(+), 5 deletions(-)
> create mode 100644 tools/testing/selftests/kvm/demand_paging_test.c
> create mode 100644 tools/testing/selftests/kvm/lib/test_util.c
>
Queued patches 1-9, thanks.
Paolo