Re: [PATCH 2/3] KVM: dynamise halt_poll_ns adjustment

From: Wanpeng Li
Date: Mon Aug 24 2015 - 18:34:32 EST

Next message: Wanpeng Li: "Re: [PATCH 1/3] KVM: make halt_poll_ns per-VCPU"
Previous message: Dave Chinner: "Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes"
In reply to: David Matlack: "Re: [PATCH 2/3] KVM: dynamise halt_poll_ns adjustment"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Davidï
On 8/25/15 1:00 AM, David Matlack wrote:

On Mon, Aug 24, 2015 at 5:53 AM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote:

There are two new kernel parameters for changing the halt_poll_ns:
halt_poll_ns_grow and halt_poll_ns_shrink. halt_poll_ns_grow affects
halt_poll_ns when an interrupt arrives and halt_poll_ns_shrink
does it when idle VCPU is detected.

halt_poll_ns_shrink/ |
halt_poll_ns_grow | interrupt arrives | idle VCPU is detected
---------------------+----------------------+-------------------
< 1 | = halt_poll_ns | = 0
< halt_poll_ns | *= halt_poll_ns_grow | /= halt_poll_ns_shrink
otherwise | += halt_poll_ns_grow | -= halt_poll_ns_shrink

A third new parameter, halt_poll_ns_max, controls the maximal halt_poll_ns;
it is internally rounded down to a closest multiple of halt_poll_ns_grow.

I like the idea of growing and shrinking halt_poll_ns, but I'm not sure
we grow and shrink in the right places here. For example, if vcpu->halt_poll_ns
gets down to 0, I don't see how it can then grow back up.

This has already done in __grow_halt_poll_ns().

This might work better:
if (poll successfully for interrupt): stay the same
else if (length of kvm_vcpu_block is longer than halt_poll_ns_max): shrink
else if (length of kvm_vcpu_block is less than halt_poll_ns_max): grow

where halt_poll_ns_max is something reasonable, like 2 millisecond.

You get diminishing returns from halt polling as the length of the
halt gets longer (halt polling only reduces halt latency by 10-15 us).
So there's little benefit to polling longer than a few milliseconds.

Great idea, David! Thanks for your review and I will implement it in next version. :-)

Regards,
Wanpeng Li

Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
---
virt/kvm/kvm_main.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 80 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index a122b52..bcfbd35 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -66,9 +66,28 @@
MODULE_AUTHOR("Qumranet");
MODULE_LICENSE("GPL");

-static unsigned int halt_poll_ns;
+#define KVM_HALT_POLL_NS 500000
+#define KVM_HALT_POLL_NS_GROW 2
+#define KVM_HALT_POLL_NS_SHRINK 0
+#define KVM_HALT_POLL_NS_MAX \
+ INT_MAX / KVM_HALT_POLL_NS_GROW
+
+static unsigned int halt_poll_ns = KVM_HALT_POLL_NS;
module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);

+/* Default doubles per-vcpu halt_poll_ns. */
+static int halt_poll_ns_grow = KVM_HALT_POLL_NS_GROW;
+module_param(halt_poll_ns_grow, int, S_IRUGO);
+
+/* Default resets per-vcpu halt_poll_ns . */
+int halt_poll_ns_shrink = KVM_HALT_POLL_NS_SHRINK;
+module_param(halt_poll_ns_shrink, int, S_IRUGO);
+
+/* Default is to compute the maximum so we can never overflow. */
+unsigned int halt_poll_ns_actual_max = KVM_HALT_POLL_NS_MAX;
+unsigned int halt_poll_ns_max = KVM_HALT_POLL_NS_MAX;
+module_param(halt_poll_ns_max, int, S_IRUGO);
+
/*
* Ordering of locks:
*
@@ -1907,6 +1926,62 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
}
EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);

+static unsigned int __grow_halt_poll_ns(unsigned int val)
+{
+ if (halt_poll_ns_grow < 1)
+ return halt_poll_ns;
+
+ val = min(val, halt_poll_ns_actual_max);
+
+ if (val == 0)
+ return halt_poll_ns;
+
+ if (halt_poll_ns_grow < halt_poll_ns)
+ val *= halt_poll_ns_grow;
+ else
+ val += halt_poll_ns_grow;
+
+ return val;
+}
+
+static unsigned int __shrink_halt_poll_ns(int val, int modifier, int minimum)
+{
+ if (modifier < 1)
+ return 0;
+
+ if (modifier < halt_poll_ns)
+ val /= modifier;
+ else
+ val -= modifier;
+
+ return val;
+}
+
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+{
+ vcpu->halt_poll_ns = __grow_halt_poll_ns(vcpu->halt_poll_ns);
+}
+
+static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)
+{
+ vcpu->halt_poll_ns = __shrink_halt_poll_ns(vcpu->halt_poll_ns,
+ halt_poll_ns_shrink, halt_poll_ns);
+}
+
+/*
+ * halt_poll_ns_actual_max is computed to be one grow_halt_poll_ns() below
+ * halt_poll_ns_max. (See __grow_halt_poll_ns for the reason.)
+ * This prevents overflows, because ple_halt_poll_ns is int.
+ * halt_poll_ns_max effectively rounded down to a multiple of halt_poll_ns_grow in
+ * this process.
+ */
+static void update_halt_poll_ns_actual_max(void)
+{
+ halt_poll_ns_actual_max =
+ __shrink_halt_poll_ns(max(halt_poll_ns_max, halt_poll_ns),
+ halt_poll_ns_grow, INT_MIN);
+}
+
static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
{
if (kvm_arch_vcpu_runnable(vcpu)) {
@@ -1941,6 +2016,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
*/
if (kvm_vcpu_check_block(vcpu) < 0) {
++vcpu->stat.halt_successful_poll;
+ grow_halt_poll_ns(vcpu);
goto out;
}
cur = ktime_get();
@@ -1954,6 +2030,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
break;

waited = true;
+ if (halt_poll_ns)
+ shrink_halt_poll_ns(vcpu);
schedule();
}

@@ -2950,6 +3028,7 @@ static void hardware_enable_nolock(void *junk)
cpumask_set_cpu(cpu, cpus_hardware_enabled);

r = kvm_arch_hardware_enable();
+ update_halt_poll_ns_actual_max();

if (r) {
cpumask_clear_cpu(cpu, cpus_hardware_enabled);
--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Wanpeng Li: "Re: [PATCH 1/3] KVM: make halt_poll_ns per-VCPU"
Previous message: Dave Chinner: "Re: [PATCH block/for-linus] writeback: fix syncing of I_DIRTY_TIME inodes"
In reply to: David Matlack: "Re: [PATCH 2/3] KVM: dynamise halt_poll_ns adjustment"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]