Re: [PATCH v3 07/26] KVM: VMX: Move preemption timer <=> hrtimer dance to common x86

From: Paolo Bonzini
Date: Wed Mar 29 2023 - 09:48:36 EST


On 3/29/23 14:34, Tudor Ambarus wrote:
This patch fixes the bug reported at:
LINK:
https://syzkaller.appspot.com/bug?id=489beb3d76ef14cc6cd18125782dc6f86051a605

One may find the strace at:
LINK:https://syzkaller.appspot.com/text?tag=CrashLog&x=1798b54ec80000
and the c reproducer at:
LINK:https://syzkaller.appspot.com/text?tag=ReproC&x=10365781c80000

Since I've no experience with kvm, it would be helpful if one of you can
provide some guidance. Do you think it is worth to backport this patch
to stable (together with its prerequisite patches), or shall I try to
get familiar with the code and try to provide a less invasive fix?

I think it is enough to fix the conflicts in vmx_pre_block and
vmx_post_block, there are no prerequisites:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0718658268fe..895069038856 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7577,17 +7577,11 @@ static int vmx_pre_block(struct kvm_vcpu *vcpu)
if (pi_pre_block(vcpu))
return 1;
- if (kvm_lapic_hv_timer_in_use(vcpu))
- kvm_lapic_switch_to_sw_timer(vcpu);
-
return 0;
}
static void vmx_post_block(struct kvm_vcpu *vcpu)
{
- if (kvm_x86_ops.set_hv_timer)
- kvm_lapic_switch_to_hv_timer(vcpu);
-
pi_post_block(vcpu);
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fcfa3fedf84f..4eca3ec38afd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10022,12 +10022,28 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
{
+ bool hv_timer;
+
if (!kvm_arch_vcpu_runnable(vcpu) &&
(!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
+ /*
+ * Switch to the software timer before halt-polling/blocking as
+ * the guest's timer may be a break event for the vCPU, and the
+ * hypervisor timer runs only when the CPU is in guest mode.
+ * Switch before halt-polling so that KVM recognizes an expired
+ * timer before blocking.
+ */
+ hv_timer = kvm_lapic_hv_timer_in_use(vcpu);
+ if (hv_timer)
+ kvm_lapic_switch_to_sw_timer(vcpu);
+
srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
kvm_vcpu_block(vcpu);
vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
+ if (hv_timer)
+ kvm_lapic_switch_to_hv_timer(vcpu);
+
if (kvm_x86_ops.post_block)
static_call(kvm_x86_post_block)(vcpu);
@@ -10266,6 +10282,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
r = -EINTR;
goto out;
}
+ /*
+ * It should be impossible for the hypervisor timer to be in
+ * use before KVM has ever run the vCPU.
+ */
+ WARN_ON_ONCE(kvm_lapic_hv_timer_in_use(vcpu));
kvm_vcpu_block(vcpu);
if (kvm_apic_accept_events(vcpu) < 0) {
r = 0;

The fix is due to the second "if" changing from
kvm_x86_ops.set_hv_timer to hv_timer.

Paolo