Re: [PATCH v12 15/19] tsc: Upgrade TSC clocksource rating

From: Nikunj A. Dadhania
Date: Thu Oct 10 2024 - 02:45:05 EST




On 10/9/2024 9:46 PM, Sean Christopherson wrote:
> On Wed, Oct 09, 2024, Nikunj A Dadhania wrote:
>> In virtualized environments running on modern CPUs, the underlying
>> platform guarantees a stable, always-running TSC, i.e. the
>> TSC is a superior timesource compared to other clock sources (such as
>> kvmclock, HPET, ACPI timer, APIC, etc.).
>>
>> Upgrade the rating of the early and regular clock source to prefer TSC over
>> other clock sources when TSC is invariant, non-stop and stable.
>>
>> Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>> Signed-off-by: Nikunj A Dadhania <nikunj@xxxxxxx>
>> ---
>> arch/x86/kernel/tsc.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
>> index c83f1091bb4f..8150f2104474 100644
>> --- a/arch/x86/kernel/tsc.c
>> +++ b/arch/x86/kernel/tsc.c
>> @@ -1264,6 +1264,21 @@ static void __init check_system_tsc_reliable(void)
>>  		tsc_disable_clocksource_watchdog();
>>  }
>>
>> +static void __init upgrade_clock_rating(struct clocksource *tsc_early,
>> +					struct clocksource *tsc)
>> +{
>> +	/*
>> +	 * Upgrade the clock rating for TSC early and regular clocksource when
>> +	 * the underlying platform provides non-stop, invariant and stable TSC.
>> +	 */
>> +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
>> +	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
>> +	    !tsc_unstable) {
>
> Somewhat of a side topic, should KVM (as a hypervisor) be enumerating something
> to guests to inform them that the TSC is reliable, i.e. that X86_FEATURE_TSC_RELIABLE
> can be forced?

Xen does something similar by advertising TSC-related information as part of
a CPUID leaf (Leaf 4 (0x40000x03)).
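
For illustration, here is a minimal sketch of how a guest could decode that
leaf. The bit layout is my reading of the comment in Xen's public
arch-x86/cpuid.h (EAX bit 0: emulated TSC, bit 1: host TSC known to be
reliable, bit 2: RDTSCP available; EBX: tsc_mode; ECX: guest TSC frequency
in kHz) and should be checked against that header. The struct, macro and
function names are hypothetical, not existing kernel or Xen symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed EAX bit layout of Xen CPUID leaf 4, sub-leaf 0 (hypothetical names). */
#define XEN_TSC_EAX_EMULATED		(1u << 0)
#define XEN_TSC_EAX_HOST_RELIABLE	(1u << 1)
#define XEN_TSC_EAX_RDTSCP		(1u << 2)

struct xen_tsc_info {
	bool emulated;		/* TSC reads are emulated by the hypervisor */
	bool host_reliable;	/* hypervisor vouches for the host TSC */
	bool rdtscp;		/* RDTSCP instruction available */
	uint32_t tsc_mode;	/* raw EBX value */
	uint32_t guest_khz;	/* raw ECX value, guest TSC frequency in kHz */
};

/* Decode register values already read from the leaf; does not execute CPUID. */
static struct xen_tsc_info xen_decode_tsc_leaf(uint32_t eax, uint32_t ebx,
					       uint32_t ecx)
{
	struct xen_tsc_info info = {
		.emulated	= (eax & XEN_TSC_EAX_EMULATED) != 0,
		.host_reliable	= (eax & XEN_TSC_EAX_HOST_RELIABLE) != 0,
		.rdtscp		= (eax & XEN_TSC_EAX_RDTSCP) != 0,
		.tsc_mode	= ebx,
		.guest_khz	= ecx,
	};

	return info;
}
```

A KVM equivalent would need its own leaf and bit definition; nothing like
that is proposed in this series.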

> Or, should KVM (as the guest) infer X86_FEATURE_TSC_RELIABLE if
> INVARIANT_TSC is advertised by KVM (the hypervisor)?

I am not sure about this though.

> Also, why on earth is 0x8000_0007.EDX manually scattered via x86_power?

Are you referring to the CPU capability settings in early_init_amd() that
depend on x86_power?
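
For reference, the bit in question is CPUID 0x8000_0007.EDX[8] (Invariant
TSC). Below is a minimal userspace sketch of the same check, assuming a
GCC or Clang toolchain that provides <cpuid.h>. The macro and helper names
are made up for this example and are not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_APM_LEAF		0x80000007u	/* Advanced Power Management leaf */
#define APM_EDX_INVARIANT_TSC	(1u << 8)	/* EDX bit 8: Invariant TSC */

/* Pure decode step: test the Invariant TSC bit in a raw EDX value. */
static bool edx_has_invariant_tsc(uint32_t edx)
{
	return (edx & APM_EDX_INVARIANT_TSC) != 0;
}

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* Read the leaf on the current CPU; false if the leaf is not supported. */
static bool cpu_has_invariant_tsc(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(CPUID_APM_LEAF, &eax, &ebx, &ecx, &edx))
		return false;

	return edx_has_invariant_tsc(edx);
}
#endif
```

In the kernel, early_init_amd() tests the same bit through c->x86_power
before setting X86_FEATURE_CONSTANT_TSC and X86_FEATURE_NONSTOP_TSC.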

>
>> +		tsc_early->rating = 499;
>> +		tsc->rating = 500;
>> +	}
>> +}
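
To spell out the intended behaviour, here is a standalone sketch of the
rating decision in the hunk above. It models the three conditions as plain
booleans instead of boot_cpu_has() and tsc_unstable, and it assumes the
current defaults of 299 for the early clocksource and 300 for the regular
one. The struct and function names are hypothetical.

```c
#include <stdbool.h>

struct tsc_ratings {
	int early;	/* rating of the early TSC clocksource */
	int regular;	/* rating of the regular TSC clocksource */
};

/*
 * Return the ratings the TSC clocksources would end up with. The upgrade
 * applies only when all three conditions hold; otherwise the assumed
 * defaults are left untouched.
 */
static struct tsc_ratings pick_tsc_ratings(bool constant_tsc, bool nonstop_tsc,
					   bool tsc_unstable)
{
	struct tsc_ratings r = { .early = 299, .regular = 300 };

	if (constant_tsc && nonstop_tsc && !tsc_unstable) {
		r.early = 499;
		r.regular = 500;
	}

	return r;
}
```

Keeping the early rating one below the regular rating preserves the
existing ordering between the two TSC clocksources.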

Regards
Nikunj