[BUG] XEN/PV dom0 time management

From: Thomas Gleixner
Date: Mon Aug 07 2023 - 05:19:48 EST


Hi!

Something in XEN/PV time management seems to be seriously broken:

timekeeping watchdog on CPU9: Marking clocksource 'tsc' as unstable because the skew is too large:
[ 152.557154] clocksource: 'xen' wd_nsec: 511979417 wd_now: 24e4d7625e wd_last: 24c65332c5 mask: ffffffffffffffff
[ 152.566197] clocksource: 'tsc' cs_nsec: 512468734 cs_now: 9a306c9b808c cs_last: 9a302c9e30ba mask: ffffffffffffffff
[ 152.572319] clocksource: Clocksource 'tsc' skewed 489317 ns (0 ms) over watchdog 'xen' interval of 511979417 ns (511 ms)
[ 152.578067] clocksource: 'tsc' is current clocksource.
[ 152.581023] tsc: Marking TSC unstable due to clocksource watchdog
[ 152.583751] clocksource: Checking clocksource tsc synchronization from CPU 5 to CPUs 0,3,8,10,12,15.
[ 152.590860] clocksource: CPUs 8 ahead of CPU 5 for clocksource tsc.
[ 152.597196] clocksource: CPU 5 check durations 14197ns - 124761ns for clocksource tsc.
[ 152.602675] clocksource: Switched to clocksource xen
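
As a cross-check of the numbers above, the following minimal Python sketch recomputes the deltas and the skew from the values printed in the log. It only redoes the arithmetic: it does not reproduce the kernel's watchdog logic, and the threshold the kernel applied is not shown in this report. The variable names are chosen for illustration.

```python
# Values copied verbatim from the 'xen' and 'tsc' log lines above.
wd_now, wd_last = 0x24e4d7625e, 0x24c65332c5        # watchdog ('xen') readings
cs_now, cs_last = 0x9a306c9b808c, 0x9a302c9e30ba    # clocksource ('tsc') readings
wd_nsec = 511979417   # interval as measured by the watchdog, in ns
cs_nsec = 512468734   # same interval as measured by the TSC, in ns
MASK = 0xffffffffffffffff

# Raw counter deltas, masked the same way the log's 'mask' field indicates.
wd_delta = (wd_now - wd_last) & MASK   # raw 'xen' units
cs_delta = (cs_now - cs_last) & MASK   # raw TSC cycles

# Skew between the two measurements of the same interval.
skew_ns = abs(cs_nsec - wd_nsec)
skew_ppm = skew_ns / wd_nsec * 1e6

# TSC rate implied by the logged cycle delta and cs_nsec (derived, not logged).
implied_tsc_hz = cs_delta / (cs_nsec * 1e-9)

print(f"xen raw delta : {wd_delta} (wd_nsec = {wd_nsec})")
print(f"tsc raw delta : {cs_delta} cycles")
print(f"skew          : {skew_ns} ns ({skew_ppm:.1f} ppm)")
print(f"implied TSC   : {implied_tsc_hz / 1e9:.4f} GHz")
```

The raw 'xen' delta comes out equal to wd_nsec, and the skew matches the 489317 ns in the log, which is a little under 0.1% of the 511 ms interval.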

This is fully reproducible, with variations in the failure report, in the
following setup:

- VM running on KVM on a SKLX machine

- Debian bookworm install with XEN 4.17

- Happens with the off-the-shelf Debian 6.1 kernel and with current
  upstream (6.5-rc4)

Why am I convinced that this is a XENPV issue?

Simply because the same kernels booted w/o XEN on the same VM and the
same hardware do not have any issue with using TSC as clocksource. The
TSC on that machine is stable and fully synchronized. The clocksource
watchdog uses kvm-clock to monitor TSC and it never had any complaints.
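
For anyone comparing the two boot configurations, a small sketch like the following can record which clocksource a running kernel ended up with. It reads the standard sysfs files current_clocksource and available_clocksource; the function name and the `base` parameter are illustrative, and the code simply returns nothing useful when the files are absent.

```python
from pathlib import Path

# Standard sysfs location of the clocksource controls on Linux.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_DIR):
    """Return (current, available) clocksource names.

    Returns (None, []) if the sysfs files do not exist, e.g. when this
    is not run on Linux. 'base' is overridable only to ease testing.
    """
    base = Path(base)
    current_file = base / "current_clocksource"
    available_file = base / "available_clocksource"
    if not current_file.exists() or not available_file.exists():
        return None, []
    current = current_file.read_text().strip()
    available = available_file.read_text().split()
    return current, available


if __name__ == "__main__":
    current, available = read_clocksources()
    print("current  :", current)
    print("available:", " ".join(available))
```

Running it once on the bare KVM guest and once under XEN, shortly after boot and again after the watchdog message, would show the switch from tsc to xen reported in the log.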

But with XEN underneath it's a matter of minutes after boot until it
happens. I tried to make sense of it, but ran out of steam and patience,
so I decided to report this to the XEN wizards.

Thanks,

tglx