Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

From: Ashish Kalra
Date: Wed Feb 24 2021 - 12:53:08 EST


On Thu, Feb 18, 2021 at 12:32:47PM -0600, Kalra, Ashish wrote:
>
> -----Original Message-----
> From: Sean Christopherson <seanjc@xxxxxxxxxx>
> Sent: Tuesday, February 16, 2021 7:03 PM
> To: Kalra, Ashish <Ashish.Kalra@xxxxxxx>
> Cc: pbonzini@xxxxxxxxxx; tglx@xxxxxxxxxxxxx; mingo@xxxxxxxxxx; hpa@xxxxxxxxx; rkrcmar@xxxxxxxxxx; joro@xxxxxxxxxx; bp@xxxxxxx; Lendacky, Thomas <Thomas.Lendacky@xxxxxxx>; x86@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; srutherford@xxxxxxxxxx; venu.busireddy@xxxxxxxxxx; Singh, Brijesh <brijesh.singh@xxxxxxx>
> Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl
>
> On Thu, Feb 04, 2021, Ashish Kalra wrote:
> > From: Brijesh Singh <brijesh.singh@xxxxxxx>
> >
> > The ioctl is used to retrieve a guest's shared pages list.
>
> > What's the performance hit to boot time if KVM_HC_PAGE_ENC_STATUS is passed
> > through to userspace? That way, userspace could manage the set of pages in
> > whatever data structure they want, and these get/set ioctls go away.
>
> I would be more concerned about the performance hit during guest DMA I/O if
> the page encryption status hypercalls are passed through to user-space:
> guest DMA I/O frequently sets pages up for encryption and then flips them
> back at DMA completion, so guest I/O will surely take a performance hit with
> this pass-through approach.
>

Here are some rough performance numbers comparing the number of heavyweight
VMEXITs to the number of hypercalls during a SEV guest boot (launch of an
Ubuntu 18.04 guest):

# ./perf record -e kvm:kvm_userspace_exit -e kvm:kvm_hypercall -a ./qemu-system-x86_64 -enable-kvm -cpu host -machine q35 -smp 16,maxcpus=64 -m 512M -drive if=pflash,format=raw,unit=0,file=/home/ashish/sev-migration/qemu-5.1.50/OVMF_CODE.fd,readonly -drive if=pflash,format=raw,unit=1,file=OVMF_VARS.fd -drive file=../ubuntu-18.04.qcow2,if=none,id=disk0,format=qcow2 -device virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true -device scsi-hd,drive=disk0 -object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1,policy=0x0 -machine memory-encryption=sev0 -trace events=/tmp/events -nographic -monitor pty -monitor unix:monitor-source,server,nowait -qmp unix:/tmp/qmp-sock,server,nowait -device virtio-rng-pci,disable-legacy=on,iommu_platform=true

...
...

root@diesel2540:/home/ashish/sev-migration/qemu-5.1.50# ./perf report
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 981K of event 'kvm:kvm_userspace_exit'
# Event count (approx.): 981021
#
# Overhead Command Shared Object Symbol
# ........ ............... ................ ..................
#
100.00% qemu-system-x86 [kernel.vmlinux] [k] kvm_vcpu_ioctl


# Samples: 19K of event 'kvm:kvm_hypercall'
# Event count (approx.): 19573
#
# Overhead Command Shared Object Symbol
# ........ ............... ................ .........................
#
100.00% qemu-system-x86 [kernel.vmlinux] [k] kvm_emulate_hypercall

Out of these 19573 hypercalls, 19479 are page encryption status hypercalls,
i.e., almost all of the hypercalls here are page encryption status hypercalls.

The above data indicates that there will be ~2% more heavyweight VMEXITs
during SEV guest boot if the page encryption status hypercalls are passed
through to host userspace.

But then Brijesh pointed out that OVMF currently triggers a lot of VMEXITs
because it does not use a DMA pool to minimize the C-bit toggles; in other
words, the OVMF bounce buffer does a page state change on every DMA allocate
and free.

So here is the performance analysis after the kernel and initrd have been
loaded into memory using grub, with perf attached just before the kernel
boots:

# Samples: 1M of event 'kvm:kvm_userspace_exit'
# Event count (approx.): 1081235
#
# Overhead Trace output
# ........ ........................
#
99.77% reason KVM_EXIT_IO (2)
0.23% reason KVM_EXIT_MMIO (6)

# Samples: 1K of event 'kvm:kvm_hypercall'
# Event count (approx.): 1279
#

So, as the above data indicates, Linux makes only ~1K hypercalls,
compared to the ~18K hypercalls made by OVMF in the earlier full-boot run.

Does the above add a prerequisite that OVMF needs to be optimized before
hypercall pass-through can be done?

Thanks,
Ashish