Re: [PATCH] x86, clock: Fix kvm guest tsc initialization
From: Paolo Bonzini
Date: Thu Sep 08 2016 - 09:33:49 EST
On 08/09/2016 15:07, Prarit Bhargava wrote:
> When booting a kvm guest on AMD with the latest kernel, the following
> messages are displayed in the boot log:
>
> tsc: Unable to calibrate against PIT
> tsc: HPET/PMTIMER calibration failed
>
> aa297292d708 ("x86/tsc: Enumerate SKL cpu_khz and tsc_khz via CPUID")
> introduced a change to account for a difference in cpu and tsc frequencies for
> Intel SKL processors. Before this change the native tsc set
> x86_platform.calibrate_tsc to native_calibrate_tsc() which is a hardware
> calibration of the tsc, and in tsc_init() executed
>
> tsc_khz = x86_platform.calibrate_tsc();
> cpu_khz = tsc_khz;
>
> The kvm code changed x86_platform.calibrate_tsc to kvm_get_tsc_khz() and
> executed the same tsc_init() function. This meant that KVM guests did not
> execute the native hardware calibration function.
>
> After aa297292d708, there are separate native calibrations for cpu_khz and
> tsc_khz. The code sets x86_platform.calibrate_tsc to native_calibrate_tsc()
> which is now an Intel specific calibration function, and
> x86_platform.calibrate_cpu to native_calibrate_cpu() which is the "old"
> native_calibrate_tsc() function (i.e., the native hardware calibration
> function). [...]
>
> The kvm code should not call the hardware initialization in
> native_calibrate_cpu(), as it isn't applicable for kvm and it didn't do that
> prior to aa297292d708. Setting x86_platform.calibrate_cpu to NULL is not
> appropriate as cpu_khz_from_cpuid() must be called to get the correct
> value of cpu_khz on Intel KVM guests.
>
> This patch resolves this issue by setting x86_platform.calibrate_cpu to
> cpu_khz_from_cpuid() for KVM guests, which allows Intel KVM guests to get
> the right cpu frequency.
KVM guests don't have that CPUID leaf at all. kvm_get_tsc_khz can
double as x86_platform.calibrate_cpu too for KVM guests, restoring the
behavior prior to aa297292d708.
Thanks,
Paolo
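
The suggestion above can be sketched in userspace C as follows. This is
only a rough model under stated assumptions: the mock_ names, the struct
layout, and the 2400000 kHz value are hypothetical stand-ins, not the
kernel's actual definitions, and mock_tsc_init() is heavily simplified
compared to the real tsc_init().

```c
/* Hypothetical stand-in for the two calibration hooks in x86_platform. */
struct mock_platform_ops {
	unsigned long (*calibrate_cpu)(void);
	unsigned long (*calibrate_tsc)(void);
};

static struct mock_platform_ops mock_x86_platform;
static unsigned long mock_cpu_khz;
static unsigned long mock_tsc_khz;

/*
 * Assumption: a fixed value for illustration. The real kvm_get_tsc_khz()
 * obtains the rate from the hypervisor rather than returning a constant.
 */
static unsigned long mock_kvm_get_tsc_khz(void)
{
	return 2400000UL;
}

/* Point both hooks at the same paravirtual source, as suggested above. */
static void mock_kvmclock_init(void)
{
	mock_x86_platform.calibrate_tsc = mock_kvm_get_tsc_khz;
	mock_x86_platform.calibrate_cpu = mock_kvm_get_tsc_khz;
}

/* Simplified model of the boot path: query each hook once. */
static void mock_tsc_init(void)
{
	mock_cpu_khz = mock_x86_platform.calibrate_cpu();
	mock_tsc_khz = mock_x86_platform.calibrate_tsc();
}
```

Calling mock_kvmclock_init() and then mock_tsc_init() leaves
mock_cpu_khz equal to mock_tsc_khz without any PIT/HPET style hardware
calibration running, which is the pre-aa297292d708 behavior the reply
describes.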
> Fixes: aa297292d708 ("x86/tsc: Enumerate SKL cpu_khz and tsc_khz via CPUID")
> Signed-off-by: Prarit Bhargava <prarit@xxxxxxxxxx>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: "Radim Krčmář" <rkrcmar@xxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> Cc: x86@xxxxxxxxxx
> Cc: Len Brown <len.brown@xxxxxxxxx>
> Cc: "Peter Zijlstra (Intel)" <peterz@xxxxxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxx>
> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Cc: "Christopher S. Hall" <christopher.s.hall@xxxxxxxxx>
> Cc: David Woodhouse <dwmw2@xxxxxxxxxxxxx>
> Cc: kvm@xxxxxxxxxxxxxxx
> ---
> arch/x86/include/asm/tsc.h | 1 +
> arch/x86/kernel/kvmclock.c | 1 +
> arch/x86/kernel/tsc.c | 2 +-
> 3 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
> index 33b6365c22fe..1bfb3a14dad0 100644
> --- a/arch/x86/include/asm/tsc.h
> +++ b/arch/x86/include/asm/tsc.h
> @@ -37,6 +37,7 @@ extern int unsynchronized_tsc(void);
> extern int check_tsc_unstable(void);
> extern unsigned long native_calibrate_cpu(void);
> extern unsigned long native_calibrate_tsc(void);
> +extern unsigned long cpu_khz_from_cpuid(void);
> extern unsigned long long native_sched_clock_from_tsc(u64 tsc);
>
> extern int tsc_clocksource_reliable;
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 1d39bfbd26bb..1fe23cff7c3e 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -289,6 +289,7 @@ void __init kvmclock_init(void)
> put_cpu();
>
> x86_platform.calibrate_tsc = kvm_get_tsc_khz;
> + x86_platform.calibrate_cpu = cpu_khz_from_cpuid;
> x86_platform.get_wallclock = kvm_get_wallclock;
> x86_platform.set_wallclock = kvm_set_wallclock;
> #ifdef CONFIG_X86_LOCAL_APIC
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 78b9cb5a26af..9265ea8effe9 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -699,7 +699,7 @@ unsigned long native_calibrate_tsc(void)
> return crystal_khz * ebx_numerator / eax_denominator;
> }
>
> -static unsigned long cpu_khz_from_cpuid(void)
> +unsigned long cpu_khz_from_cpuid(void)
> {
> unsigned int eax_base_mhz, ebx_max_mhz, ecx_bus_mhz, edx;
>
>