Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl

From: Steve Rutherford
Date: Thu Feb 25 2021 - 18:00:54 EST


On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra <ashish.kalra@xxxxxxx> wrote:
>
> On Wed, Feb 24, 2021 at 10:22:33AM -0800, Sean Christopherson wrote:
> > On Wed, Feb 24, 2021, Ashish Kalra wrote:
> > > # Samples: 19K of event 'kvm:kvm_hypercall'
> > > # Event count (approx.): 19573
> > > #
> > > # Overhead  Command          Shared Object     Symbol
> > > # ........  ...............  ................  .........................
> > > #
> > >    100.00%  qemu-system-x86  [kernel.vmlinux]  [k] kvm_emulate_hypercall
> > >
> > > Out of these 19573 hypercalls, # of page encryption status hcalls are 19479,
> > > so almost all hypercalls here are page encryption status hypercalls.
> >
> > Oof.
> >
> > > The above data indicates that there will be ~2% more Heavyweight VMEXITs
> > > during SEV guest boot if we do page encryption status hypercalls
> > > pass-through to host userspace.
> > >
> > > But then Brijesh pointed out that OVMF currently does a lot of VMEXITs
> > > because it doesn't use the DMA pool to minimize C-bit toggles; in other
> > > words, the OVMF bounce buffer does a page state change on every DMA
> > > allocate and free.
> > >
> > > So here is the performance analysis after kernel and initrd have been
> > > loaded into memory using grub and then starting perf just before booting the kernel.
> > >
> > > These are the performance #'s after kernel and initrd have been loaded into memory,
> > > then perf is attached and kernel is booted :
> > >
> > > # Samples: 1M of event 'kvm:kvm_userspace_exit'
> > > # Event count (approx.): 1081235
> > > #
> > > # Overhead  Trace output
> > > # ........  ........................
> > > #
> > >     99.77%  reason KVM_EXIT_IO (2)
> > >      0.23%  reason KVM_EXIT_MMIO (6)
> > >
> > > # Samples: 1K of event 'kvm:kvm_hypercall'
> > > # Event count (approx.): 1279
> > > #
> > >
> > > So as the above data indicates, Linux is only making ~1K hypercalls,
> > > compared to ~18K hypercalls made by OVMF in the above use case.
> > >
> > > Does the above add a prerequisite that OVMF needs to be optimized
> > > before hypercall pass-through can be done?
> >
> > Disclaimer: my math could be totally wrong.
> >
> > I doubt it's a hard requirement. Assuming a conservative roundtrip time of 50k
> > cycles, those 18K hypercalls will add well under 1/2 a second of boot time.
> > If userspace can push the roundtrip time down to 10k cycles, the overhead is
> > more like 50 milliseconds.
> >
> > That being said, this does seem like a good OVMF cleanup, irrespective of this
> > new hypercall. I assume it's not cheap to convert a page between encrypted and
> > decrypted.
> >
> > Thanks much for getting the numbers!
>
> Considering the above data and guest boot time latencies
> (and potential issues with OVMF and optimizations required there),
> do we have any consensus on whether we want to do page encryption
> status hypercall passthrough or not ?
>
> Thanks,
> Ashish

Thanks for grabbing the data!
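
For reference, Sean's back-of-the-envelope estimate quoted above can be
reproduced with a small helper. This is only a sketch: the function name
is made up for illustration, and the clock frequency is an assumption,
since nothing in this thread states one.

```c
#include <stdint.h>

/*
 * Added boot time in milliseconds for `calls` hypercalls that each cost
 * `cycles_per_call` cycles, on a CPU running at `hz` cycles per second.
 * The frequency passed in is an assumption; the thread does not give one.
 */
static inline double hypercall_overhead_ms(uint64_t calls,
                                           uint64_t cycles_per_call,
                                           double hz)
{
    return (double)(calls * cycles_per_call) / hz * 1000.0;
}
```

With an assumed 3 GHz clock, 18000 calls at 50k cycles each come to
about 300 ms, and at 10k cycles each to about 60 ms, which is in line
with the rough figures quoted above.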

I am fine with both paths. Sean has stated an explicit desire for
hypercall exiting, so I think that would be the current consensus.

If we want to do hypercall exiting, this should be in a follow-up
series where we implement something more generic, e.g. a hypercall
exiting bitmap or hypercall exit list. If we are taking the hypercall
exit route, we can drop the KVM side of the hypercall. Userspace could
also handle the MSR using MSR filters (would need to confirm that).
Then userspace could also be in control of the cpuid bit.
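
To make the "hypercall exiting bitmap" idea a bit more concrete, here is
a minimal sketch of what such a bitmap could look like. Everything in it
is hypothetical (the struct, the helper names, and the 64-entry limit);
it is not an existing KVM interface, just an illustration of one bit per
hypercall number that userspace sets and the exit path tests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound on hypercall numbers covered by the bitmap. */
#define HC_EXIT_MAX_NR 64

/* One bit per hypercall number; a set bit means "exit to userspace". */
struct hc_exit_bitmap {
    uint64_t bits[(HC_EXIT_MAX_NR + 63) / 64];
};

/* Request a userspace exit for hypercall `nr`; false if out of range. */
static inline bool hc_exit_set(struct hc_exit_bitmap *bm, unsigned int nr)
{
    if (nr >= HC_EXIT_MAX_NR)
        return false;
    bm->bits[nr / 64] |= UINT64_C(1) << (nr % 64);
    return true;
}

/* Should hypercall `nr` be forwarded to userspace? */
static inline bool hc_exit_test(const struct hc_exit_bitmap *bm,
                                unsigned int nr)
{
    if (nr >= HC_EXIT_MAX_NR)
        return false;
    return (bm->bits[nr / 64] >> (nr % 64)) & 1;
}
```

In this sketch, any hypercall whose bit is clear would keep being
handled in the kernel as it is today, so existing hypercalls are
unaffected unless userspace opts in.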

Essentially, I think you could drop most of the host kernel work if
there were generic support for hypercall exiting. Then userspace would
be responsible for all of that. Thoughts on this?

--Steve