[PATCH v3 1/2] KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update()

From: Vitaly Kuznetsov
Date: Wed Mar 31 2021 - 08:42:47 EST


When guest time is reset with KVM_SET_CLOCK(0), it is possible for
'hv_clock->system_time' to become a small negative number. This happens
because in KVM_SET_CLOCK handling we set 'kvm->arch.kvmclock_offset' based
on get_kvmclock_ns(kvm) but when KVM_REQ_CLOCK_UPDATE is handled,
kvm_guest_time_update() does (masterclock in use case):

hv_clock.system_time = ka->master_kernel_ns + v->kvm->arch.kvmclock_offset;

And 'master_kernel_ns' represents the last time when masterclock
got updated, it can precede KVM_SET_CLOCK() call. Normally, this is not a
problem, the difference is very small, e.g. I'm observing
hv_clock.system_time = -70 ns. The issue comes from the fact that
'hv_clock.system_time' is stored as unsigned and 'system_time / 100' in
compute_tsc_page_parameters() becomes a very big number.

Use 'master_kernel_ns' instead of get_kvmclock_ns() when masterclock is in
use and get_kvmclock_base_ns() when it's not to prevent 'system_time' from
going negative.

Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
---
arch/x86/kvm/x86.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2bfd00da465f..2f54beed0105 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5728,6 +5728,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
}
#endif
case KVM_SET_CLOCK: {
+ struct kvm_arch *ka = &kvm->arch;
struct kvm_clock_data user_ns;
u64 now_ns;

@@ -5746,8 +5747,22 @@ long kvm_arch_vm_ioctl(struct file *filp,
* pvclock_update_vm_gtod_copy().
*/
kvm_gen_update_masterclock(kvm);
- now_ns = get_kvmclock_ns(kvm);
- kvm->arch.kvmclock_offset += user_ns.clock - now_ns;
+
+ /*
+ * This pairs with kvm_guest_time_update(): when masterclock is
+ * in use, we use master_kernel_ns + kvmclock_offset to set
+ * unsigned 'system_time' so if we use get_kvmclock_ns() (which
+ * is slightly ahead) here we risk going negative on unsigned
+ * 'system_time' when 'user_ns.clock' is very small.
+ */
+ spin_lock_irq(&ka->pvclock_gtod_sync_lock);
+ if (kvm->arch.use_master_clock)
+ now_ns = ka->master_kernel_ns;
+ else
+ now_ns = get_kvmclock_base_ns();
+ ka->kvmclock_offset = user_ns.clock - now_ns;
+ spin_unlock_irq(&ka->pvclock_gtod_sync_lock);
+
kvm_make_all_cpus_request(kvm, KVM_REQ_CLOCK_UPDATE);
break;
}
--
2.30.2