Re: [PATCH RFC 0/2] KVM: x86: Support using the VMX preemption timer for APIC Timer periodic/oneshot mode

From: Paolo Bonzini
Date: Tue Jul 11 2017 - 03:43:23 EST

On 11/07/2017 02:13, Andy Lutomirski wrote:
> On 10/11/2016 05:17 AM, Wanpeng Li wrote:
>> Most windows guests which I have on hand currently still utilize APIC Timer
>> periodic/oneshot mode instead of APIC Timer tsc-deadline mode:
>> - windows 2008 server r2
>> - windows 2012 server r2
>> - windows 7
>> - windows 10
>> This patchset adds support for using the VMX preemption timer for APIC Timer
>> periodic/oneshot mode.
>> I added a print to the oneshot mode testcase of kvm-unit-tests/apic.flat and
>> observed that w/ patch the latency is ~2% of w/o patch. I think maybe
>> something is still not right in the patchset; in addition, tmcct in
>> apic_get_tmcct() may not be calculated correctly. Your comments to improve
>> the patchset are greatly appreciated.
>> Wanpeng Li (2):
>>   KVM: lapic: Extract start_sw_period() to handle oneshot/periodic mode
>>   KVM: x86: Support using the vmx preemption timer for APIC Timer periodic/one mode
>>
>>  arch/x86/kvm/lapic.c | 162 ++++++++++++++++++++++++++++++---------------------
>>  1 file changed, 95 insertions(+), 67 deletions(-)
> I think this is a step in the right direction, but I think there's a
> different approach that would be much, much faster: use the VMX
> preemption timer for *host* preemption. Specifically, do this:
> 1. Refactor the host TSC deadline timer a bit to allow the TSC deadline
> timer to be "borrowed". It might look something like this:
> u64 borrow_tsc_deadline(void (*timer_callback)());
> The caller is now permitted to use the TSC deadline timer for its own
> nefarious purposes. The caller promises to call return_tsc_deadline()
> in a timely manner if the TSC exceeds the return value while the
> deadline timer is borrowed.
> If the TSC deadline fires while it's borrowed, timer_callback() will be
> called.
> void return_tsc_deadline(bool timer_fired);
> The caller is done borrowing the TSC deadline timer. The caller need
> not reset the TSC deadline timer MSR to its previous value before
> calling this. It must be called with IRQs on and preemption off.
> Getting this to work cleanly without races may be a bit tricky. So be it.
> 2. Teach KVM to use the VMX preemption timer as a substitute deadline
> timer while in guest mode. Specifically, KVM will borrow_tsc_deadline()
> (if TSC deadline is enabled) when entering guest mode and
> return_tsc_deadline() when returning out of guest mode.
> 3. Now KVM can change its MSR bitmaps to allow the guest to program the
> TSC deadline MSR directly. No exit at all needed to handle guest writes
> to the deadline timer.

This assumes that the TSC deadline MSR observes the guest TSC offset,
which I'm not at all sure of. If it doesn't, you break live migration.
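
To make the concern concrete, here is a minimal userspace sketch, assuming
the usual VMX relation guest_tsc = host_tsc + tsc_offset and ignoring TSC
scaling. The helper names are made up for illustration and are not KVM
functions. If the hardware compared a guest-written deadline against the
host TSC, the timer would be off by exactly the offset, and any workaround
that exposes the offset to the guest stops being valid once the guest lands
on a host with a different one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of this is KVM code. */

/* Guest view of the TSC under VMX TSC offsetting (scaling ignored). */
static uint64_t guest_tsc(uint64_t host_tsc, int64_t tsc_offset)
{
	return host_tsc + (uint64_t)tsc_offset;
}

/* Host TSC value at which a deadline written in guest TSC units is due. */
static uint64_t deadline_guest_to_host(uint64_t guest_deadline,
				       int64_t tsc_offset)
{
	return guest_deadline - (uint64_t)tsc_offset;
}

/*
 * Error, in TSC ticks, if the raw guest-written value were compared
 * against the host TSC. Positive means late, negative means early.
 * The result always equals tsc_offset.
 */
static int64_t raw_deadline_error(uint64_t guest_deadline, int64_t tsc_offset)
{
	uint64_t correct = deadline_guest_to_host(guest_deadline, tsc_offset);

	return (int64_t)(guest_deadline - correct);
}

/* Example: guest TSC started near zero while the host TSC was at 1000000. */
static void demo(void)
{
	const int64_t offset = -1000000;
	const uint64_t guest_deadline = 5000;

	/* The guest reads 5000 when the host TSC is 1005000. */
	assert(guest_tsc(1005000, offset) == guest_deadline);
	/* So the deadline is really due at host TSC 1005000. */
	assert(deadline_guest_to_host(guest_deadline, offset) == 1005000);
	/* Compared raw, it would fire 1000000 ticks early. */
	assert(raw_deadline_error(guest_deadline, offset) == offset);
}
```

With the numbers in demo(), the guest-programmed value already lies in the
past from the host TSC's point of view, so a timer that ignored the offset
would fire immediately.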

Also, while it would halve the cost of a guest's programming of the
timer tick, you would still incur the cost of a vmexit to call
timer_callback (it would be different if you could program the TSC
deadline timer to send a posted interrupt, of course). Things would be
half as slow, but still a far cry from bare metal.

Really, we should just ask Intel to virtualize the TSC deadline MSR.