Re: [PATCH v2 2/3] KVM: dynamic halt_poll_ns adjustment

From: David Matlack
Date: Thu Aug 27 2015 - 12:26:15 EST


On Thu, Aug 27, 2015 at 2:59 AM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx> wrote:
> Hi David,
> On 8/26/15 1:19 AM, David Matlack wrote:
>>
>> Thanks for writing v2, Wanpeng.
>>
>> On Mon, Aug 24, 2015 at 11:35 PM, Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>> wrote:
>>>
>>> There is a downside of halt_poll_ns since poll is still happen for idle
>>> VCPU which can waste cpu usage. This patch adds the ability to adjust
>>> halt_poll_ns dynamically.
>>
>> What testing have you done with these patches? Do you know if this removes
>> the overhead of polling in idle VCPUs? Do we lose any of the performance
>> from always polling?
>>
>>> There are two new kernel parameters for changing the halt_poll_ns:
>>> halt_poll_ns_grow and halt_poll_ns_shrink. A third new parameter,
>>> halt_poll_ns_max, controls the maximal halt_poll_ns; it is internally
>>> rounded down to a closest multiple of halt_poll_ns_grow. The shrink/grow
>>> matrix is suggested by David:
>>>
>>> if (poll successfully for interrupt): stay the same
>>> else if (length of kvm_vcpu_block is longer than halt_poll_ns_max):
>>> shrink
>>> else if (length of kvm_vcpu_block is less than halt_poll_ns_max): grow
>>
>> The way you implemented this wasn't what I expected. I thought you would
>> time
>> the whole function (kvm_vcpu_block). But I like your approach better. It's
>> simpler and [by inspection] does what we want.
>
>
> I see there is more idle vCPUs overhead w/ this method even more than always
> halt-poll, so I bring back grow vcpu->halt_poll_ns when interrupt arrives
> and shrinks when idle VCPU is detected. The perfomance looks good in v4.

Why did this patch have a worse idle overhead than always poll?

>
> Regards,
> Wanpeng Li
>
>
>>
>>> halt_poll_ns_shrink/ |
>>> halt_poll_ns_grow | grow halt_poll_ns | shrink halt_poll_ns
>>> ---------------------+----------------------+-------------------
>>> < 1 | = halt_poll_ns | = 0
>>> < halt_poll_ns | *= halt_poll_ns_grow | /= halt_poll_ns_shrink
>>> otherwise | += halt_poll_ns_grow | -= halt_poll_ns_shrink
>>
>> I was curious why you went with this approach rather than just the
>> middle row, or just the last row. Do you think we'll want the extra
>> flexibility?
>>
>>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>> ---
>>> virt/kvm/kvm_main.c | 65
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>> 1 file changed, 64 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>> index 93db833..2a4962b 100644
>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -66,9 +66,26 @@
>>> MODULE_AUTHOR("Qumranet");
>>> MODULE_LICENSE("GPL");
>>>
>>> -static unsigned int halt_poll_ns;
>>> +#define KVM_HALT_POLL_NS 500000
>>> +#define KVM_HALT_POLL_NS_GROW 2
>>> +#define KVM_HALT_POLL_NS_SHRINK 0
>>> +#define KVM_HALT_POLL_NS_MAX 2000000
>>
>> The macros are not necessary. Also, hard coding the numbers in the param
>> definitions will make reading the comments above them easier.
>>
>>> +
>>> +static unsigned int halt_poll_ns = KVM_HALT_POLL_NS;
>>> module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);
>>>
>>> +/* Default doubles per-vcpu halt_poll_ns. */
>>> +static unsigned int halt_poll_ns_grow = KVM_HALT_POLL_NS_GROW;
>>> +module_param(halt_poll_ns_grow, int, S_IRUGO);
>>> +
>>> +/* Default resets per-vcpu halt_poll_ns . */
>>> +static unsigned int halt_poll_ns_shrink = KVM_HALT_POLL_NS_SHRINK;
>>> +module_param(halt_poll_ns_shrink, int, S_IRUGO);
>>> +
>>> +/* halt polling only reduces halt latency by 10-15 us, 2ms is enough */
>>
>> Ah, I misspoke before. I was thinking about round-trip latency. The
>> latency
>> of a single halt is reduced by about 5-7 us.
>>
>>> +static unsigned int halt_poll_ns_max = KVM_HALT_POLL_NS_MAX;
>>> +module_param(halt_poll_ns_max, int, S_IRUGO);
>>
>> We can remove halt_poll_ns_max. vcpu->halt_poll_ns can always start at
>> zero
>> and grow from there. Then we just need one module param to keep
>> vcpu->halt_poll_ns from growing too large.
>>
>> [ It would make more sense to remove halt_poll_ns and keep
>> halt_poll_ns_max,
>> but since halt_poll_ns already exists in upstream kernels, we probably
>> can't
>> remove it. ]
>>
>>> +
>>> /*
>>> * Ordering of locks:
>>> *
>>> @@ -1907,6 +1924,48 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu
>>> *vcpu, gfn_t gfn)
>>> }
>>> EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);
>>>
>>> +static unsigned int __grow_halt_poll_ns(unsigned int val)
>>> +{
>>> + if (halt_poll_ns_grow < 1)
>>> + return halt_poll_ns;
>>> +
>>> + val = min(val, halt_poll_ns_max);
>>> +
>>> + if (val == 0)
>>> + return halt_poll_ns;
>>> +
>>> + if (halt_poll_ns_grow < halt_poll_ns)
>>> + val *= halt_poll_ns_grow;
>>> + else
>>> + val += halt_poll_ns_grow;
>>> +
>>> + return val;
>>> +}
>>> +
>>> +static unsigned int __shrink_halt_poll_ns(int val, int modifier, int
>>> minimum)
>>
>> minimum never gets used.
>>
>>> +{
>>> + if (modifier < 1)
>>> + return 0;
>>> +
>>> + if (modifier < halt_poll_ns)
>>> + val /= modifier;
>>> + else
>>> + val -= modifier;
>>> +
>>> + return val;
>>> +}
>>> +
>>> +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
>>
>> These wrappers aren't necessary.
>>
>>> +{
>>> + vcpu->halt_poll_ns = __grow_halt_poll_ns(vcpu->halt_poll_ns);
>>> +}
>>> +
>>> +static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)
>>> +{
>>> + vcpu->halt_poll_ns = __shrink_halt_poll_ns(vcpu->halt_poll_ns,
>>> + halt_poll_ns_shrink, halt_poll_ns);
>>> +}
>>> +
>>> static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>>> {
>>> if (kvm_arch_vcpu_runnable(vcpu)) {
>>> @@ -1954,6 +2013,10 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>> break;
>>>
>>> waited = true;
>>> + if (vcpu->halt_poll_ns > halt_poll_ns_max)
>>> + shrink_halt_poll_ns(vcpu);
>>> + else
>>> + grow_halt_poll_ns(vcpu);
>>
>> Shouldn't this go after the loop, and before "out:", in case we schedule
>> more than once? You can gate it on "if (waited)" so it only runs if we
>> actually scheduled.
>>
>>> schedule();
>>> }
>>>
>>> --
>>> 1.9.1
>>>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/