Re: [PATCH] KVM: Boost vCPU candidiate in user mode which is delivering interrupt

From: Wanpeng Li
Date: Tue Apr 20 2021 - 06:27:55 EST


On Tue, 20 Apr 2021 at 18:23, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 20/04/21 10:48, Wanpeng Li wrote:
> >> I was thinking of something simpler:
> >>
> >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> >> index 9b8e30dd5b9b..455c648f9adc 100644
> >> --- a/virt/kvm/kvm_main.c
> >> +++ b/virt/kvm/kvm_main.c
> >> @@ -3198,10 +3198,9 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
> >> {
> >> struct kvm *kvm = me->kvm;
> >> struct kvm_vcpu *vcpu;
> >> - int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
> >> int yielded = 0;
> >> int try = 3;
> >> - int pass;
> >> + int pass, num_passes = 1;
> >> int i;
> >>
> >> kvm_vcpu_set_in_spin_loop(me, true);
> >> @@ -3212,13 +3211,14 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
> >> * VCPU is holding the lock that we need and will release it.
> >> * We approximate round-robin by starting at the last boosted VCPU.
> >> */
> >> - for (pass = 0; pass < 2 && !yielded && try; pass++) {
> >> - kvm_for_each_vcpu(i, vcpu, kvm) {
> >> - if (!pass && i <= last_boosted_vcpu) {
> >> - i = last_boosted_vcpu;
> >> - continue;
> >> - } else if (pass && i > last_boosted_vcpu)
> >> - break;
> >> + for (pass = 0; pass < num_passes; pass++) {
> >> + int idx = me->kvm->last_boosted_vcpu;
> >> + int n = atomic_read(&kvm->online_vcpus);
> >> + for (i = 0; i < n; i++, idx++) {
> >> + if (idx == n)
> >> + idx = 0;
> >> +
> >> + vcpu = kvm_get_vcpu(kvm, idx);
> >> if (!READ_ONCE(vcpu->ready))
> >> continue;
> >> if (vcpu == me)
> >> @@ -3226,23 +3226,36 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
> >> if (rcuwait_active(&vcpu->wait) &&
> >> !vcpu_dy_runnable(vcpu))
> >> continue;
> >> - if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
> >> - !kvm_arch_vcpu_in_kernel(vcpu))
> >> - continue;
> >> if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> >> continue;
> >>
> >> + if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
> >> + !kvm_arch_vcpu_in_kernel(vcpu)) {
> >> + /*
> >> + * A vCPU running in userspace can get to kernel mode via
> >> + * an interrupt. That's a worse choice than a CPU already
> >> + * in kernel mode so only do it on a second pass.
> >> + */
> >> + if (!vcpu_dy_runnable(vcpu))
> >> + continue;
> >> + if (pass == 0) {
> >> + num_passes = 2;
> >> + continue;
> >> + }
> >> + }
> >> +
> >> yielded = kvm_vcpu_yield_to(vcpu);
> >> if (yielded > 0) {
> >> kvm->last_boosted_vcpu = i;
> >> - break;
> >> + goto done;
> >> } else if (yielded < 0) {
> >> try--;
> >> if (!try)
> >> - break;
> >> + goto done;
> >> }
> >> }
> >> }
> >> +done:
> >
> > We just tested the above post against 96 vCPUs VM in an over-subscribe
> > scenario, the score of pbzip2 fluctuated drastically. Sometimes it is
> > worse than vanilla, but the average improvement is around 2.2%. The
> > new version of my post is around 9.3%,the origial posted patch is
> > around 10% which is totally as expected since now both IPI receivers
> > in user-mode and lock-waiters are second class citizens.
>
> Fair enough. Of the two patches you posted I prefer the original, so
> I'll go with that one.

Great! Thanks. :)

Wanpeng