From: David Woodhouse <dwmw@xxxxxxxxxxxx>
Most of the time there's no need to kick the vCPU and deliver the timer
event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
directly from the timer callback, and only fall back to the slow path
when it's necessary to do so.
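
As a rough illustration of that control flow (not the code in this
patch), the callback's "try fast, else defer" logic can be modelled in
plain userspace C. Everything named sketch_* below is hypothetical, as
is the fast_path_usable flag that stands in for whatever makes the real
kvm_xen_set_evtchn_fast() return -EWOULDBLOCK. The timer_expires field
is cleared only when delivery completes, as the v3 notes below describe.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU Xen timer state. */
struct sketch_vcpu {
	uint64_t timer_expires;  /* nonzero while a timer is armed */
	bool timer_pending;      /* set when delivery was deferred */
	bool kick_requested;     /* set when the vCPU must be woken */
	bool fast_path_usable;   /* models whether fast delivery can proceed */
	unsigned int delivered;  /* count of events delivered directly */
};

/*
 * Hypothetical stand-in for the fast delivery helper: deliver the event
 * without sleeping, or return -EWOULDBLOCK if that is not possible now.
 */
static int sketch_set_evtchn_fast(struct sketch_vcpu *v)
{
	if (!v->fast_path_usable)
		return -EWOULDBLOCK;
	v->delivered++;
	return 0;
}

/* Timer callback: try the fast path first, defer only when it cannot complete. */
static void sketch_timer_callback(struct sketch_vcpu *v)
{
	int rc = sketch_set_evtchn_fast(v);

	if (rc != -EWOULDBLOCK) {
		/* Delivery is complete, so the timer is no longer armed. */
		v->timer_expires = 0;
		return;
	}

	/* Slow path: mark the timer pending and ask for the vCPU to be kicked. */
	v->timer_pending = true;
	v->kick_requested = true;
}
```

In the deferred case the sketch leaves timer_expires untouched, so a
later reader of the field still sees the timer as outstanding.
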
This gives a significant improvement in timer latency testing (using
nanosleep() for various periods and then measuring the actual time
elapsed).
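
The test program itself is not part of this message. A minimal sketch
of that kind of measurement, assuming CLOCK_MONOTONIC as the reference
clock and using the hypothetical helper names ts_to_ns() and
sleep_overshoot_ns(), might look like this:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Sleep for 'request_ns' and return how far the wakeup overshot the
 * request, in nanoseconds. Returns -1 if a clock or sleep call failed
 * (including a sleep interrupted by a signal).
 */
static int64_t sleep_overshoot_ns(int64_t request_ns)
{
	struct timespec req = {
		.tv_sec = request_ns / 1000000000LL,
		.tv_nsec = request_ns % 1000000000LL,
	};
	struct timespec before, after;

	if (clock_gettime(CLOCK_MONOTONIC, &before))
		return -1;
	if (nanosleep(&req, NULL))
		return -1;
	if (clock_gettime(CLOCK_MONOTONIC, &after))
		return -1;

	/* Elapsed time minus the requested time is the added latency. */
	return ts_to_ns(&after) - ts_to_ns(&before) - request_ns;
}
```

Calling sleep_overshoot_ns() in a loop inside the guest, for a range of
periods, gives a latency distribution to compare before and after the
change.
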
However, there was a reason¹ the fast path was dropped when this support
was first added. The current code holds vcpu->mutex for all operations
on the kvm->arch.xen.timer_expires field, and the fast path introduces a
potential race condition. Avoid that race by ensuring the hrtimer is
(temporarily) cancelled before making changes in kvm_xen_start_timer(),
and also when reading the values out for KVM_XEN_VCPU_ATTR_TYPE_TIMER.
¹ https://lore.kernel.org/kvm/846caa99-2e42-4443-1070-84e49d2f11d2@xxxxxxxxxx/
Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
---
 • v2: Remember, and deal with, those races.

 • v3: Drop the assertions for vcpu being loaded; those can be done
       separately if at all.

       Reorder the code in xen_timer_callback() to make it clearer
       that kvm->arch.xen.timer_expires is being cleared in the case
       where the event channel delivery is *complete*, as opposed to
       the -EWOULDBLOCK deferred path.

       Drop the 'pending' variable in kvm_xen_vcpu_get_attr() and
       restart the hrtimer if (kvm->arch.xen.timer_expires), which
       ought to be exactly the same thing (that's the *point* in
       cancelling the timer, to make it truthful as we return its
       value to userspace).

       Improve comments.

arch/x86/kvm/xen.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)