Re: [PATCH v2] KVM: halt-polling: poll if emulated lapic timer will fire soon

From: Yang Zhang
Date: Sun May 22 2016 - 21:26:48 EST


On 2016/5/21 2:37, David Matlack wrote:
On Thu, May 19, 2016 at 7:04 PM, Yang Zhang <yang.zhang.wz@xxxxxxxxx> wrote:
On 2016/5/20 2:36, David Matlack wrote:

On Thu, May 19, 2016 at 11:01 AM, David Matlack <dmatlack@xxxxxxxxxx>
wrote:

On Thu, May 19, 2016 at 6:27 AM, Wanpeng Li <kernellwp@xxxxxxxxx> wrote:

From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>

If an emulated lapic timer will fire soon (within 10us, the base of
dynamic halt-polling; the poll time of TCP_RR, a message-passing
workload at the lower end of the latency range, is < 10us), we can
treat it as a short halt and poll until it fires. The fire callback
apic_timer_fn() will set KVM_REQ_PENDING_TIMER, and this flag will be
checked during the busy poll. This avoids the context switch overhead
and the latency of waking up the vCPU.
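
As a rough illustration of the check described above (not the patch
itself), the decision can be sketched in user-space C. The names
SHORT_HALT_NS and timer_fires_soon() are hypothetical; only the 10us
window comes from the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the 10us window named in the patch description. */
#define SHORT_HALT_NS 10000ULL

/*
 * Hypothetical helper: returns true when the emulated timer is already
 * due or will fire within the window, so the halt counts as "short"
 * and polling is preferred over scheduling out.
 */
static bool timer_fires_soon(uint64_t now_ns, uint64_t expire_ns)
{
    if (expire_ns <= now_ns)
        return true; /* timer is already due */
    return expire_ns - now_ns <= SHORT_HALT_NS;
}
```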


If I understand correctly, your patch aims to reduce the latency of
(APIC Timer expires) -> (Guest resumes execution) using halt-polling.
Let me know if I'm misunderstanding.

In general, I don't think it makes sense to poll for timer interrupts.
We know when the timer interrupt is going to arrive. If we care about
the latency of delivering that interrupt to the guest, we should
program the hrtimer to wake us up slightly early, and then deliver the
virtual timer interrupt right on time (I think KVM's TSC Deadline
Timer emulation already does this).


(It looks like the way to enable this feature is to set the module
parameter lapic_timer_advance_ns and make sure your guest is using the
TSC Deadline timer instead of the APIC Timer.)
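
To make the "wake up slightly early" idea concrete, here is a small
arithmetic sketch in user-space C. It is not KVM code:
early_wakeup_ns() is a hypothetical helper, and advance_ns merely
stands in for the value of lapic_timer_advance_ns.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the guest-visible timer deadline and an
 * advance amount, return the time at which the host timer should fire
 * so that the interrupt can be delivered right on the deadline.
 * Clamps to 0 instead of underflowing.
 */
static uint64_t early_wakeup_ns(uint64_t deadline_ns, uint64_t advance_ns)
{
    if (deadline_ns <= advance_ns)
        return 0;
    return deadline_ns - advance_ns;
}
```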


This feature is slightly different from the current advance expiration
approach. Advance expiration relies on the VCPU running (it polls before
vmentry). But in some cases, the timer interrupt may be blocked by another
thread (i.e., the IF bit is clear) and the VCPU cannot be scheduled to run
immediately. So even if the timer is advanced, the VCPU may still see the
latency. Polling is different: it ensures the VCPU is aware of the timer
expiration before it is scheduled out.


I'm curious to know if this scheme
would give the same performance improvement to iperf as your patch.

We discussed this a bit before on the mailing list
(https://lkml.org/lkml/2016/3/29/680). I'd like to see halt-polling
and timer interrupts go in the opposite direction: if the next timer
event (from any timer) is less than vcpu->halt_poll_ns, don't poll at
all.
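
The rule proposed in the paragraph above could be sketched as follows.
This is a user-space illustration only: should_halt_poll() and
next_timer_delta_ns are hypothetical names, and halt_poll_ns stands in
for vcpu->halt_poll_ns.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: skip polling entirely when the next timer event
 * (from any timer) is closer than the polling window, since the wakeup
 * time is then already known.
 */
static bool should_halt_poll(uint64_t next_timer_delta_ns,
                             uint64_t halt_poll_ns)
{
    if (halt_poll_ns == 0)
        return false; /* polling disabled */
    return next_timer_delta_ns >= halt_poll_ns;
}
```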


iperf TCP gets a ~6% bandwidth improvement.


Can you explain why your patch results in this bandwidth improvement?


It should be reasonable. I have seen the same improvement with a context
switch benchmark: the latency is reduced from ~2600ns to ~2300ns with a
similar mechanism (the same idea but a different implementation).

It's not obvious to me why polling for a timer interrupt would improve
context switch latency. Can you explain a bit more?

We have a workload which uses a high resolution timer (less than 1ms) inside the guest. It relies on the timer to wake itself up. Sometimes the timer is expected to fire just after the VCPU has blocked on executing a halt instruction. But the thread that is then running on the CPU turns off hardware interrupts for a long time due to disk access. This causes the timer interrupt to be blocked until interrupts are re-enabled.
As an optimization, we let the VCPU poll for a while before scheduling out if the next timer will arrive soon. The results are good when running several workloads inside the guest.
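
A toy model of the scenario, in user-space C, may make the difference clearer. Everything here is an assumption for illustration: the struct, the helper names, and the fixed wakeup cost are hypothetical, and the model ignores everything except the ordering of events.

```c
#include <stdint.h>

/* Toy model; all times are in ns on a single timeline. */
struct halt_scenario {
    uint64_t timer_expire_ns;  /* when the guest timer is due */
    uint64_t irq_reenabled_ns; /* when the other thread re-enables interrupts */
    uint64_t wakeup_cost_ns;   /* assumed cost of scheduling the VCPU back in */
};

/*
 * VCPU was scheduled out: the timer interrupt is only seen once
 * interrupts are enabled again, and then the VCPU must be woken up.
 */
static uint64_t resume_if_blocked(const struct halt_scenario *s)
{
    uint64_t seen = s->timer_expire_ns > s->irq_reenabled_ns
                        ? s->timer_expire_ns
                        : s->irq_reenabled_ns;
    return seen + s->wakeup_cost_ns;
}

/*
 * VCPU polled instead of scheduling out: in this model it notices the
 * expiration as soon as the timer is due.
 */
static uint64_t resume_if_polled(const struct halt_scenario *s)
{
    return s->timer_expire_ns;
}
```

In this model the benefit of polling grows with the time the other thread keeps interrupts disabled, which matches the disk access case above.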

--
best regards
yang