Re: [PATCH v2] KVM: selftests: Compare wall time from xen shinfo against KVM_GET_CLOCK

From: Sean Christopherson
Date: Tue Apr 30 2024 - 12:02:27 EST


On Mon, Apr 29, 2024, David Woodhouse wrote:
> On Mon, 2024-04-29 at 13:45 -0700, Sean Christopherson wrote:
> > On Tue, 06 Feb 2024 16:19:50 +0100, Vitaly Kuznetsov wrote:
> > > xen_shinfo_test is observed to be flaky failing sporadically with
> > > "VM time too old". With min_ts/max_ts debug print added:
> > >
> > > Wall clock (v 3269818) 1704906491.986255664
> > > Time info 1: v 1282712 tsc 33530585736 time 14014430025 mul 3587552223 shift 4294967295 flags 1
> > > Time info 2: v 1282712 tsc 33530585736 time 14014430025 mul 3587552223 shift 4294967295 flags 1
> > > min_ts: 1704906491.986312153
> > > max_ts: 1704906506.001006963
> > > ==== Test Assertion Failure ====
> > >    x86_64/xen_shinfo_test.c:1003: cmp_timespec(&min_ts, &vm_ts) <= 0
> > >    pid=32724 tid=32724 errno=4 - Interrupted system call
> > >       1        0x00000000004030ad: main at xen_shinfo_test.c:1003
> > >       2        0x00007fca6b23feaf: ?? ??:0
> > >       3        0x00007fca6b23ff5f: ?? ??:0
> > >       4        0x0000000000405e04: _start at ??:?
> > >    VM time too old
> > >
> > > [...]
> >
> > Applied to kvm-x86 selftests, thanks!
> >
> > [1/1] KVM: selftests: Compare wall time from xen shinfo against KVM_GET_CLOCK
> >       https://github.com/kvm-x86/linux/commit/201142d16010
>
> Of course, this just highlights the fact that the very *definition* of
> the wallclock time as exposed in the Xen shinfo and MSR_KVM_WALL_CLOCK
> is entirely broken now.
>
> When the KVM clock was based on CLOCK_MONOTONIC, the delta between that
> and wallclock time was constant (well, apart from leap seconds but KVM
> has *always* been utterly hosed for that, so that's just par for the
> course). So that made sense.
>
> But when we switched the KVM clock to CLOCK_MONOTONIC_RAW, trying to
> express wallclock time in terms of the KVM clock became silly. They run
> at different rates, so the value returned by kvm_get_wall_clock_epoch()
> will be constantly changing.
>
> As I work through cleaning up the KVM clock mess, it occurred to me
> that we should maybe *refresh* the wallclock time we report to the
> guest. But I think it's just been hosed for so long that no guest could
> ever trust it for anything but knowing roughly what year it is when
> first booting, and it isn't worth fixing.
>
> What we *should* do is expose something new which exposes the NTP-
> calibrated relationship between the arch counter (or TSC) and the real
> time, being explicit about TAI and about live migration (a guest needs
> to know when it's been migrated and should throw away any NTP
> refinement that it's done for *itself*).
>
> I know we have the PTP paired reading thing, but that's *still* not
> TAI, it makes guests do the work for themselves and doesn't give a
> clean signal when live migration disrupts them.

Is the above an objection to the selftest change, or just an aside? Honest
question, I have no preference either way; I just don't have my head wrapped
around all of the clock stuff enough to have an informed opinion on what the
right/best way forward is.