Re: [PATCH v3] KVM: X86: Fix softlockup when get the current kvmclock

From: Radim Krčmář
Date: Thu Nov 16 2017 - 13:26:00 EST


2017-11-15 09:17+0800, Wanpeng Li:
> Ping, :)

Ah, sorry, I got distracted while learning about the hotplug mechanism.
Indeed, we cannot move the callback earlier because the cpufreq driver
that kvm uses on crappy hardware gets set in CPUHP_AP_ONLINE_DYN, which
is way too late.

> 2017-11-09 10:52 GMT+08:00 Wanpeng Li <kernellwp@xxxxxxxxx>:
> > From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> >
> > watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [qemu-system-x86:10185]
> > CPU: 6 PID: 10185 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc4+ #4
> > RIP: 0010:kvm_get_time_scale+0x4e/0xa0 [kvm]
> > Call Trace:
> > ? get_kvmclock_ns+0xa3/0x140 [kvm]
> > get_time_ref_counter+0x5a/0x80 [kvm]
> > kvm_hv_process_stimers+0x120/0x5f0 [kvm]
> > ? kvm_hv_process_stimers+0x120/0x5f0 [kvm]
> > ? preempt_schedule+0x27/0x30
> > ? ___preempt_schedule+0x16/0x18
> > kvm_arch_vcpu_ioctl_run+0x4b4/0x1690 [kvm]
> > ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
> > kvm_vcpu_ioctl+0x33a/0x620 [kvm]
> > ? kvm_vcpu_ioctl+0x33a/0x620 [kvm]
> > ? kvm_vm_ioctl_check_extension_generic+0x3b/0x40 [kvm]
> > ? kvm_dev_ioctl+0x279/0x6c0 [kvm]
> > do_vfs_ioctl+0xa1/0x5d0
> > ? __fget+0x73/0xa0
> > SyS_ioctl+0x79/0x90
> > entry_SYSCALL_64_fastpath+0x1e/0xa9
> >
> > This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
> > cpu-hotplug stress simultaneously. __this_cpu_read(cpu_tsc_khz) returns 0
> > (set in kvmclock_cpu_down_prep()) when the pCPU is unhotplug which results
> > in kvm_get_time_scale() gets into an infinite loop.
> >
> > This patch fixes it by treating the unhotplug pCPU as not using master clock.
> >
> > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> > Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> > ---
> > arch/x86/kvm/x86.c | 11 +++++++----
> > 1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 03869eb..d61dcce3 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1795,10 +1795,13 @@ u64 get_kvmclock_ns(struct kvm *kvm)
> >         /* both __this_cpu_read() and rdtsc() should be on the same cpu */
> >         get_cpu();
> >
> > -       kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
> > -                          &hv_clock.tsc_shift,
> > -                          &hv_clock.tsc_to_system_mul);
> > -       ret = __pvclock_read_cycles(&hv_clock, rdtsc());
> > +       if (__this_cpu_read(cpu_tsc_khz)) {
> > +               kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,

It would be safer to read __this_cpu_read(cpu_tsc_khz) only once, but I
think it works for now, as the unplug thread must be scheduled and
get_cpu() prevents changes.
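
A minimal sketch of the read-once pattern, written as a userspace model
rather than kernel code. cpu_tsc_khz_model, clock_ns_model() and its
parameters are hypothetical stand-ins, and the cycles-to-ns conversion is
much simpler than what kvm_get_time_scale() and __pvclock_read_cycles()
actually do:

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU cpu_tsc_khz; 0 models an unplugged pCPU. */
static unsigned long cpu_tsc_khz_model;

/*
 * Userspace model, not the kernel function: tsc, boot_ns and offset are
 * passed in instead of coming from rdtsc(), ktime_get_boot_ns() and
 * ka->kvmclock_offset.
 */
static uint64_t clock_ns_model(uint64_t tsc, uint64_t boot_ns, int64_t offset)
{
	/* Read the frequency once so the check and the use cannot disagree. */
	unsigned long khz = cpu_tsc_khz_model;

	if (khz) {
		/* Simplified cycles -> ns: tsc * 10^6 / khz, split to limit overflow. */
		uint64_t ns = (tsc / khz) * 1000000ULL +
			      (tsc % khz) * 1000000ULL / khz;
		return ns + (uint64_t)offset;
	}

	/* Frequency is 0: fall back to the boot-based clock, as in the else branch. */
	return boot_ns + (uint64_t)offset;
}
```

With the value held in a local, the zero check and the scale computation
are guaranteed to see the same frequency, regardless of what protects the
per-CPU variable.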

> > +                                  &hv_clock.tsc_shift,
> > +                                  &hv_clock.tsc_to_system_mul);
> > +               ret = __pvclock_read_cycles(&hv_clock, rdtsc());
> > +       } else
> > +               ret = ktime_get_boot_ns() + ka->kvmclock_offset;

Not pretty, but gets the job done ...

Reviewed-by: Radim Krčmář <rkrcmar@xxxxxxxxxx>