Re: [PATCH v3] KVM: x86: Use fast path for Xen timer delivery

From: Sean Christopherson
Date: Tue Feb 06 2024 - 21:58:41 EST


On Tue, Feb 06, 2024, David Woodhouse wrote:
> On Tue, 2024-02-06 at 10:41 -0800, Sean Christopherson wrote:
> >
> > This has an obvious-in-hindsight recursive deadlock bug.  If KVM actually needs
> > to inject a timer IRQ, and the fast path fails, i.e. the gpc is invalid,
> > kvm_xen_set_evtchn() will attempt to acquire xen.xen_lock, which is already held
>
> Hm, right. In fact, kvm_xen_set_evtchn() shouldn't actually *need* the
> xen_lock in an ideal world; it's only taking it in order to work around
> the fact that the gfn_to_pfn_cache doesn't have its *own* self-
> sufficient locking. I have patches for that...
>
> I think the *simplest* of the "patches for that" approaches is just to
> use the gpc->refresh_lock to cover all activate, refresh and deactivate
> calls. I was waiting for Paul's series to land before sending that one,
> but I'll work on it today, and double-check my belief that we can then
> just drop xen_lock from kvm_xen_set_evtchn().

While I definitely want to get rid of arch.xen.xen_lock, I don't want to address
the deadlock by relying on adding more locking to the gpc code. I want a teeny
tiny patch that is easy to review and backport. Y'all are *probably* the only
folks that care about Xen emulation, but even so, that's not a valid reason for
taking a roundabout way to fix a deadlock.

Can't we simply not take xen_lock in kvm_xen_vcpu_get_attr()? It holds vcpu->mutex
so it's mutually exclusive with kvm_xen_vcpu_set_attr(), and I don't see any
flows other than vCPU destruction that deactivate (or change) the gpc.

And the worst case scenario is that if _userspace_ is being stupid, userspace gets
a stale GPA.

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 4b4e738c6f1b..50aa28b9ffc4 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -973,8 +973,6 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 {
 	int r = -ENOENT;

-	mutex_lock(&vcpu->kvm->arch.xen.xen_lock);
-
 	switch (data->type) {
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO:
 		if (vcpu->arch.xen.vcpu_info_cache.active)
@@ -1083,7 +1081,6 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 		break;
 	}

-	mutex_unlock(&vcpu->kvm->arch.xen.xen_lock);
 	return r;
 }



If that seems too risky, we could go with an ugly and hacky, but conservative:

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 4b4e738c6f1b..456d05c5b18a 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1052,7 +1052,9 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 		 */
 		if (vcpu->arch.xen.timer_expires) {
 			hrtimer_cancel(&vcpu->arch.xen.timer);
+			mutex_unlock(&vcpu->kvm->arch.xen.xen_lock);
 			kvm_xen_inject_timer_irqs(vcpu);
+			mutex_lock(&vcpu->kvm->arch.xen.xen_lock);
 		}

 		data->u.timer.port = vcpu->arch.xen.timer_virq;