Re: [PATCH V4 3/4] KVM: x86: Add a capability to configure bus frequency for APIC timer
From: Edgecombe, Rick P
Date: Tue Apr 16 2024 - 13:08:59 EST
On Thu, 2024-03-21 at 09:37 -0700, Reinette Chatre wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
>
> Add KVM_CAP_X86_APIC_BUS_FREQUENCY capability to configure the APIC
> bus clock frequency for APIC timer emulation.
> Allow KVM_ENABLE_CAPABILITY(KVM_CAP_X86_APIC_BUS_FREQUENCY) to set the
> frequency in nanoseconds. When using this capability, the user space
> VMM should configure CPUID leaf 0x15 to advertise the frequency.
>
> Vishal reported that the TDX guest kernel expects a 25MHz APIC bus
> frequency but ends up getting interrupts at a significantly higher rate.
>
> The TDX architecture hard-codes the core crystal clock frequency to
> 25MHz and mandates exposing it via CPUID leaf 0x15. The TDX architecture
> does not allow the VMM to override the value.
>
> In addition, per Intel SDM:
> "The APIC timer frequency will be the processor’s bus clock or core
> crystal clock frequency (when TSC/core crystal clock ratio is
> enumerated in CPUID leaf 0x15) divided by the value specified in
> the divide configuration register."
>
> The resulting 25MHz APIC bus frequency conflicts with the KVM hardcoded
> APIC bus frequency of 1GHz.
>
> KVM doesn't enumerate CPUID leaf 0x15 to the guest unless the user
> space VMM sets it using KVM_SET_CPUID. If CPUID leaf 0x15 is
> enumerated, the guest kernel uses it as the APIC bus frequency. If not,
> the guest kernel measures the frequency based on other known timers like
> the ACPI timer or the legacy PIT. As reported by Vishal, the TDX guest
> kernel expects a 25MHz timer frequency but gets timer interrupts more
> frequently due to the 1GHz frequency used by KVM.
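To put rough numbers on the mismatch described above, here is a minimal
sketch in Python. It is purely illustrative and is not KVM or guest kernel
code. The 4 ms tick, the divide-by-1 setting, and all function and variable
names are assumptions made for the example; only the 25MHz and 1GHz figures
come from the commit message.

```python
NSEC_PER_SEC = 1_000_000_000

def bus_cycle_ns(bus_freq_hz: int) -> float:
    """Length of one APIC bus cycle in nanoseconds."""
    return NSEC_PER_SEC / bus_freq_hz

def timer_period_ns(initial_count: int, divide: int, bus_freq_hz: int) -> float:
    """Time until a timer loaded with initial_count expires.

    The counter decrements once every `divide` bus cycles, so the period
    is initial_count * divide bus cycles.
    """
    return initial_count * divide * bus_cycle_ns(bus_freq_hz)

GUEST_HZ = 25_000_000     # frequency the TDX guest derives from CPUID leaf 0x15
KVM_HZ = 1_000_000_000    # KVM's hardcoded APIC bus frequency

# Hypothetical guest setup: a 4 ms tick with divide-by-1 at 25MHz.
count = 100_000

expected_ns = timer_period_ns(count, 1, GUEST_HZ)  # what the guest asked for
actual_ns = timer_period_ns(count, 1, KVM_HZ)      # what a 1GHz emulation delivers

print(f"expected {expected_ns / 1e6} ms, actual {actual_ns / 1e6} ms, "
      f"{expected_ns / actual_ns:.0f}x too early")
```

Under these assumptions the same initial count expires after 0.1 ms
instead of 4 ms, i.e. the guest sees timer interrupts 40 times more often
than it intended.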
>
> To ensure that the guest doesn't have a conflicting view of the APIC bus
> frequency, allow userspace to tell KVM to use the same frequency that
> TDX mandates instead of the default 1GHz.
>
> There are several options to address this:
> 1. Make KVM able to configure the APIC bus frequency (this series).
>    Pro: It resembles the existing hardware. Recent Intel CPUs
>         adopt 25MHz.
>    Con: Requires the VMM to emulate the APIC timer at 25MHz.
> 2. Make the TDX architecture enumerate a configurable frequency in
>    CPUID leaf 0x15, or not enumerate the leaf at all.
>    Pro: Any APIC bus frequency is allowed.
>    Con: Deviates from the TDX architecture.
> 3. Make the TDX guest kernel use 1GHz when it's running on KVM.
>    Con: The kernel ignores CPUID leaf 0x15.
> 4. Change CPUID leaf 0x15 under TDX to report the crystal clock frequency
>    as 1GHz.
>    Pro: This has been the virtual APIC frequency for KVM guests for 13
>         years.
>    Pro: This requires changing only one hard-coded constant in TDX.
>    Con: It doesn't work with other VMMs as TDX isn't specific to KVM.
>    Con: Core crystal clock frequency is also used to calculate TSC
>         frequency.
>    Con: If it is configured to a value different from the hardware's, it
>         will break the correctness of Intel PT Mini Time Count (MTC)
>         packets in TDs.
>
> Reported-by: Vishal Annapurve <vannapurve@xxxxxxxxxx>
> Closes: https://lore.kernel.org/lkml/20231006011255.4163884-1-vannapurve@xxxxxxxxxx/
Is Closes appropriate, given the issue Vishal hit was on non-upstream code?
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> Co-developed-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> Signed-off-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>