Re: [PATCH v4 0/3] KVM: Dynamic Halt-Polling

From: Wanpeng Li
Date: Tue Sep 01 2015 - 18:58:49 EST

On 9/2/15 6:34 AM, David Matlack wrote:
On Tue, Sep 1, 2015 at 3:30 PM, Wanpeng Li wrote:
On 9/2/15 5:45 AM, David Matlack wrote:
On Thu, Aug 27, 2015 at 2:47 AM, Wanpeng Li wrote:
v3 -> v4:
* bring back growing vcpu->halt_poll_ns when an interrupt arrives and
shrinking it when an idle VCPU is detected

v2 -> v3:
* grow/shrink vcpu->halt_poll_ns by *halt_poll_ns_grow or
* drop the macros and hard code the numbers in the param definitions
* update the comments "5-7 us"
* remove halt_poll_ns_max and use halt_poll_ns as the max halt_poll_ns;
vcpu->halt_poll_ns starts at zero
* drop the wrappers
* move the grow/shrink logic before "out:" w/ "if (waited)"
I posted a patchset which adds dynamic poll toggling (on/off switch). I
think this gives you a good place to build your dynamic growth patch on
top. The toggling patch has close to zero overhead for idle VMs and
performance equivalent to always-poll for VMs doing message passing. It's
a patch that's been in my queue for a few weeks, but I just haven't had
the time to send it out. We win even more with your patchset by only
polling as much as we need (via dynamic growth/shrink). It also gives us a
better place to stand when choosing a default for halt_poll_ns. (We can
run experiments and see how high vcpu->halt_poll_ns tends to grow.)

The reason I posted a separate patch for toggling is because it adds
to kvm_vcpu_block and deals with a weird edge case (kvm_vcpu_block can get
called multiple times for one halt). To do dynamic poll adjustment

Why can this happen?

we have to time the length of each halt. Otherwise we hit some bad edge
cases:

v3: v3 had lots of idle overhead. It's because vcpu->halt_poll_ns grew
every time we had a long halt. So idle VMs looked like: 0 us -> 500 us ->
1 ms -> 2 ms -> 4 ms -> 0 us. Ideally vcpu->halt_poll_ns should just stay
at 0 when the halts are long.

v4: v4 fixed the idle overhead problem but broke dynamic growth for
message passing VMs. Every time a VM did a short halt, vcpu->halt_poll_ns
would grow. That means vcpu->halt_poll_ns will always be maxed out, even
when the halt time is much less than the max.

I think we can fix both edge cases if we make grow/shrink decisions based
on the length of kvm_vcpu_block rather than the arrival of a guest
interrupt during polling.
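
A minimal sketch of how a decision keyed on block length might look,
offered only as an illustration. The helper next_poll_ns() and the three
constants are hypothetical (not taken from either patchset); the 10 us
base and 500 us cap are just the numbers floated elsewhere in this thread,
and the doubling factor is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tunables for illustration; not from any posted patch. */
#define POLL_BASE_NS   10000u   /* 10 us: first step after being at 0 */
#define POLL_MAX_NS   500000u   /* 500 us: stays below a 1 ms timer tick */
#define POLL_GROW          2u   /* assumed multiplicative growth factor */

/*
 * Return the next poll window, given the current window (cur_ns) and how
 * long the vCPU actually stayed blocked in this halt (block_ns).
 */
static uint64_t next_poll_ns(uint64_t cur_ns, uint64_t block_ns)
{
	assert(POLL_BASE_NS <= POLL_MAX_NS);

	/* Long halt: no allowed window would have covered it, so stop polling. */
	if (block_ns > POLL_MAX_NS)
		return 0;

	/* Halt already fits inside the current window: leave it unchanged. */
	if (block_ns <= cur_ns)
		return cur_ns;

	/* Short halt that outlasted the window: grow, capped at the max. */
	uint64_t grown = cur_ns ? cur_ns * POLL_GROW : POLL_BASE_NS;
	return grown > POLL_MAX_NS ? POLL_MAX_NS : grown;
}
```

With this shape, an idle VCPU whose halts last several milliseconds stays
at 0, and a message passing VCPU with halts of about 15 us settles at 20 us
rather than climbing to the max.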

Some thoughts for dynamic growth:
* Given the Windows 10 timer tick (1 ms), let's set the maximum poll time
to less than 1 ms. 200 us has been a good value for always-poll. We can
probably go a bit higher once we have your patch. Maybe 500 us?

Did you test your patch against a Windows guest?

* The base case of dynamic growth (the first grow() after being at 0)
should be small. 500 us is too big. When I run TCP_RR in my guest I see
poll times of < 10 us. TCP_RR is on the lower end of message passing
workloads, so 10 us would be a good base case.

How can I get your TCP_RR benchmark?

Wanpeng Li
Install the netperf package, or build from here:

In the vm:

# ./netserver
# ./netperf -t TCP_RR

Be sure to use an SMP guest (we want TCP_RR to be a cross-core message
passing workload in order to test halt-polling).

Ah, ok, I use the same benchmark as yours.

Wanpeng Li
