Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

From: Ashish Kalra
Date: Thu Feb 25 2021 - 15:21:38 EST


On Wed, Feb 24, 2021 at 10:22:33AM -0800, Sean Christopherson wrote:
> On Wed, Feb 24, 2021, Ashish Kalra wrote:
> > # Samples: 19K of event 'kvm:kvm_hypercall'
> > # Event count (approx.): 19573
> > #
> > # Overhead Command Shared Object Symbol
> > # ........ ............... ................ .........................
> > #
> > 100.00% qemu-system-x86 [kernel.vmlinux] [k] kvm_emulate_hypercall
> >
> > Of these 19573 hypercalls, 19479 are page encryption status hypercalls,
> > so almost all hypercalls here are page encryption status hypercalls.
>
> Oof.
>
> > The above data indicates that there will be ~2% more heavyweight VMEXITs
> > during SEV guest boot if we pass page encryption status hypercalls
> > through to host userspace.
> >
> > But then Brijesh pointed out to me that OVMF currently does a lot of
> > VMEXITs because it doesn't use the DMA pool to minimize the C-bit toggles.
> > In other words, the OVMF bounce buffer does a page state change on every
> > DMA allocate and free.
> >
> > So here are the performance numbers after the kernel and initrd have been
> > loaded into memory using grub, with perf attached just before booting the
> > kernel:
> >
> > # Samples: 1M of event 'kvm:kvm_userspace_exit'
> > # Event count (approx.): 1081235
> > #
> > # Overhead Trace output
> > # ........ ........................
> > #
> > 99.77% reason KVM_EXIT_IO (2)
> > 0.23% reason KVM_EXIT_MMIO (6)
> >
> > # Samples: 1K of event 'kvm:kvm_hypercall'
> > # Event count (approx.): 1279
> > #
> >
> > So as the above data indicates, Linux is only making ~1K hypercalls,
> > compared to ~18K hypercalls made by OVMF in the above use case.
> >
> > Does the above add a prerequisite that OVMF needs to be optimized
> > before hypercall pass-through can be done?
>
> Disclaimer: my math could be totally wrong.
>
> I doubt it's a hard requirement. Assuming a conservative roundtrip time of 50k
> cycles, those 18K hypercalls will add well under half a second of boot time.
> If userspace can push the roundtrip time down to 10k cycles, the overhead is
> more like 50 milliseconds.
>
> That being said, this does seem like a good OVMF cleanup, irrespective of this
> new hypercall. I assume it's not cheap to convert a page between encrypted and
> decrypted.
>
> Thanks much for getting the numbers!
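
For reference, the roundtrip estimate quoted above can be reproduced with a
quick back-of-the-envelope sketch. The ~18K hypercall count and the 50k/10k
cycle figures come from this thread; the 2 GHz clock is purely an assumption
(no CPU frequency is stated here), and the function name is only illustrative.

```python
def hypercall_overhead_seconds(num_hypercalls, cycles_per_roundtrip, cpu_hz):
    """Added wall time if every hypercall costs one full userspace roundtrip."""
    return num_hypercalls * cycles_per_roundtrip / cpu_hz


OVMF_HYPERCALLS = 18_000        # ~18K hypercalls made by OVMF, per the perf data
ASSUMED_CPU_HZ = 2_000_000_000  # assumption: 2 GHz, not given in the thread

for cycles in (50_000, 10_000):
    secs = hypercall_overhead_seconds(OVMF_HYPERCALLS, cycles, ASSUMED_CPU_HZ)
    print(f"{cycles} cycles/roundtrip -> {secs * 1000:.0f} ms")
```

Under that assumed clock this gives about 450 ms at 50k cycles and about 90 ms
at 10k cycles; a faster clock scales both numbers down proportionally.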

Considering the above data and the guest boot time latencies
(and the potential OVMF issues and the optimizations required there),
do we have any consensus on whether we want to do page encryption
status hypercall pass-through or not?

Thanks,
Ashish