Re: [PATCH 2/2] KVM: selftests: Add KVM/PV clock selftest to prove timer drift correction

From: David Woodhouse
Date: Thu Apr 11 2024 - 09:29:04 EST


On Wed, 2024-04-10 at 10:15 +0000, Allister, Jack wrote:
> > AFAIR, I copied check_clocksource() from existing code during that
> > time.
>
> > The commit e440c5f2e ("KVM: selftests: Generalize check_clocksource()
> > from kvm_clock_test") has introduced sys_clocksource_is_tsc(). Later
> > it is renamed to sys_clocksource_is_based_on_tsc().
> > Any chance to re-use sys_clocksource_is_based_on_tsc()?
>
> Yes I'm more than happy to change it to that. I was using your original
> mail as a reference and did not realise there was a utility present for
> this.
>
> > Is configure_scaled_tsc() necessary? Or how about making it an
> > option/arg?
> > Then I will be able to test it on a VM/server without TSC scaling.
>
> So with TSC scaling from 3GHz (host) -> 1.5GHz (guest) I do see a skew
> of ~3500ns after the update, whereas without scaling a delta can still
> be seen but is roughly ~180ns.

I don't think it's as simple as "TSC scaling makes the drift larger".
I suspect that's just the way the arithmetic precision works out for
those frequencies. With other frequencies of host and guest you might
find that it works out closer *with* the scaling.

Consider a graph of "time" on the Y axis against the host TSC on the X
axis. As an example, let's assume the host has a TSC frequency of 3GHz.

Each of the three definitions of the KVM clock (A based on
CLOCK_MONOTONIC_RAW, B based on the guest TSC, C based directly on the
host TSC) will have a gradient of *roughly* 1 ns per three ticks.

Due to arithmetic precision, the gradient of each is going to vary
slightly. We hope that CLOCK_MONOTONIC_RAW is going to do the best, as
the other two are limited by the precision of the pvclock ABI that's
exposed to the guest. You can use http://david.woodhou.se/tsdrift.c to
see where the latter two land, for different TSC frequencies.
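
To make the precision limit concrete, here is a minimal user-space
sketch of a pvclock-style tick->ns conversion. It is *not* the kernel's
kvm_get_time_scale()/pvclock_scale_delta() verbatim (the helper names
and normalisation bounds here are mine), but it shows the same 32-bit
fixed-point rounding:

#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Derive a pvclock-style (shift, 32-bit multiplier) pair for tsc_hz.
 * The guest computes ns = ((ticks << shift) * mul) >> 32, so the whole
 * tick->ns gradient has to be squeezed into a 32-bit fraction.
 */
static void time_scale(uint64_t tsc_hz, int8_t *shift, uint32_t *mul)
{
	int8_t s = 0;

	/* Normalise so that NSEC_PER_SEC / tsc_hz lands in (0.5, 1). */
	while (tsc_hz >= 2 * NSEC_PER_SEC) {
		tsc_hz >>= 1;
		s--;
	}
	while (tsc_hz <= NSEC_PER_SEC) {
		tsc_hz <<= 1;
		s++;
	}
	*shift = s;
	*mul = (uint32_t)((NSEC_PER_SEC << 32) / tsc_hz);
}

/* Convert a TSC delta to nanoseconds the way a pvclock guest would. */
static uint64_t scale_delta(uint64_t ticks, uint32_t mul, int8_t shift)
{
	if (shift < 0)
		ticks >>= -shift;
	else
		ticks <<= shift;
	return (uint64_t)(((unsigned __int128)ticks * mul) >> 32);
}

int main(void)
{
	uint64_t tsc_hz = 3000000000ULL;  /* assumed 3GHz host TSC */
	uint64_t ticks = 86400 * tsc_hz;  /* one day of host ticks */
	uint32_t mul;
	int8_t shift;

	time_scale(tsc_hz, &shift, &mul);
	printf("pvclock ns %llu, exact ns %llu\n",
	       (unsigned long long)scale_delta(ticks, mul, shift),
	       (unsigned long long)((unsigned __int128)ticks *
				    NSEC_PER_SEC / tsc_hz));
	return 0;
}

For the 3GHz case that prints 86399999979883 against an exact
86400000000000 -- the same "host ns" figure as in the tsdrift output
below, i.e. the ~20µs per day discussed afterwards.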

$ ./tsdrift 2500000000 3000000000 | tail -1
TSC 259200000000000, guest TSC 215999999979883, guest ns 86399999971836 host ns 86399999979883 (delta -8047)
$ ./tsdrift 2700000000 3000000000 | tail -1
TSC 259200000000000, guest TSC 233279999975860, guest ns 86399999983012 host ns 86399999979883 (delta 3129)

So after a day, let's assume CLOCK_MONOTONIC_RAW will have advanced by
86400 seconds. The KVM clock based on the host TSC will be 20µs slow,
while a KVM clock based on a guest TSC frequency of 2.5GHz would be an
*additional* 8µs slower. But a guest TSC frequency of 2.7GHz would
actually run *faster* than the host-based one, and would only be 17µs
behind reality.
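
Spelling that out against an exact day of 86400000000000 ns:

  86400000000000 - 86399999979883 = 20117 ns  (host-TSC clock, ~20µs slow)
  20117 + 8047 = 28164 ns                     (2.5GHz guest, ~28µs behind)
  20117 - 3129 = 16988 ns                     (2.7GHz guest, ~17µs behind)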

Your test is measuring how *much* the host CLOCK_MONOTONIC_RAW (my
definition A) drifts from definition B, which is derived from the guest
TSC.

It demonstrates the discontinuity that KVM_REQ_MASTERCLOCK_UPDATE
introduces, by clamping the KVM clock back to the 'definition A' line.

Fixing that is in the TODO list I shared. Basically it involves
realising that in use_master_clock mode, the delta between the KVM
clock and CLOCK_MONOTONIC_RAW (ka->kvmclock_offset) is *varying* over
time. So instead of just blindly using kvmclock_offset, we should
*recalculate* it in precisely the way that your KVM_SET_CLOCK_GUEST
does.
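
In pseudo-C the idea looks something like this (every name here is
invented for the sketch; it is not the actual KVM code):

/*
 * On a masterclock update, instead of reusing the stale offset (which
 * snaps the clock back to the definition-A line), re-derive it from
 * the value the guest can currently observe.
 */
static void update_kvmclock_offset(struct kvm *kvm)
{
	/* Definition B: what the guest reads via the pvclock ABI now. */
	uint64_t guest_ns = pvclock_ns_from_guest_tsc(kvm);
	/* Definition A: the CLOCK_MONOTONIC_RAW-based host view. */
	uint64_t host_ns = host_monotonic_raw_ns();

	kvm->kvmclock_offset = guest_ns - host_ns;
}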

Having said all that... scaling from 3GHz to 1.5GHz *doesn't* lose any
precision; it shouldn't make any difference. But I guess your host TSC
isn't *really* 3GHz, it's measured against the PIT or something awful,
and comes out at a shade above or below 3GHz, leading to a more
interesting scaling factor?
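
KVM applies the scaling as a fixed-point multiply (IIRC 48 fractional
bits on VMX, 32 on SVM; the real helper is mul_u64_u64_shr()), roughly:

static uint64_t scale_tsc(uint64_t tsc, uint64_t ratio, int frac_bits)
{
	return (uint64_t)(((unsigned __int128)tsc * ratio) >> frac_bits);
}

A 1.5GHz/3GHz ratio is exactly 1ULL << (frac_bits - 1), i.e. 0.5, so
scale_tsc() returns exactly tsc/2 for every input -- no rounding at all.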

> In V2 I've adjusted the test so that now by default scaling won't take
> place, however if someone wants to test with it enabled they can pass
> "-s/--scale-tsc" to induce the greater delta.

Please do it automatically based on the availability of the feature.
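
Something like this (untested sketch; KVM_CAP_TSC_CONTROL is what the
existing tsc_scaling_sync selftest gates on, and guest_tsc_khz is
whatever frequency the test picks):

	/* Scale automatically when the host supports it, no flag needed. */
	if (kvm_check_cap(KVM_CAP_TSC_CONTROL))
		vcpu_ioctl(vcpu, KVM_SET_TSC_KHZ,
			   (void *)(unsigned long)guest_tsc_khz);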
