Re: [PATCH v3 02/41] x86/tsc: Add helper to register CPU and TSC freq calibration routines

From: David Woodhouse

Date: Wed May 20 2026 - 17:45:11 EST


On Wed, 2026-05-20 at 14:33 -0700, Sean Christopherson wrote:
> On Wed, May 20, 2026, David Woodhouse wrote:
> > On Wed, 2026-05-20 at 13:44 -0700, Sean Christopherson wrote:
> > >
> > > +       /*
> > > +        * If the TSC counts at a constant frequency across P/T states, counts
> > > +        * in deep C-states, and the TSC hasn't been marked unstable, treat the
> > > +        * TSC reliable, as guaranteed by KVM.  Note, the TSC unstable check
> > > +        * exists purely to honor the TSC being marked unstable via command
> > > +        * line, any runtime detection of an unstable will happen after this.
> > > +        */
> > > +       if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> > > +           boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> > > +           !check_tsc_unstable())
> >     {
> > > +               tsc_properties = TSC_FREQ_KNOWN_AND_RELIABLE;
> >
> >     kvmclock = 0; /* Why use it if the TSC works? The kvmclock exists
> >                      *purely* to work around a TSC which *doesn't*
> >                      have those properties checked above. */
>
> kvmclock still provides SYSTEM_TIME and WALL_CLOCK :-/

Um, SYSTEM_TIME *is* the kvmclock of which we speak, isn't it?

WALL_CLOCK is something else, and I guess we should still consume that,
yes.

>
> > > +
> > > +       kvm_tsc_khz_cpuid = kvm_para_tsc_khz();
> > > +
> > > +       /*
> > > +        * If provided, use the TSC (and APIC bus) frequency provided in KVM's
> > > +        * PV CPUID leaf even if kvmclock itself is disabled via command line.
> > > +        * The PV CPUID information isn't dependent on kvmclock in any way, and
> > > +        * in fact using the precise information is *more* important when the
> > > +        * user has explicitly disabled kvmclock to force the kernel to use the
> > > +        * TSC as its clocksource.
> > > +        */
> > > +       if (!kvmclock) {
> > > +               if (kvm_tsc_khz_cpuid)
> > > +                       tsc_register_calibration_routines(kvm_get_tsc_khz,
> > > +                                                         kvm_get_cpu_khz,
> > > +                                                         tsc_properties);
> > > +               return;
> > > +       }
> > > +
> >
> >
> > Regardless of the above, why not just register these here
> > unconditionally, and remove the later call that does the same?
>
> Because if kvmclock=n, it's only safe to call kvm_get_tsc_khz() if kvm_tsc_khz_cpuid
> is non-zero, other wise the "else" path will hit a NULL pointer deref when trying
> to get the frequency from the PV clock struct:
>
> return kvm_tsc_khz_cpuid ? : pvclock_tsc_khz(this_cpu_pvti());

Ah, right. Thanks. Ick though :)

Attachment: smime.p7s
Description: S/MIME cryptographic signature