Re: [PATCH v1 0/3] kvm:x86: simplify kvmclock update logic

From: Paolo Bonzini
Date: Tue Nov 04 2025 - 11:36:38 EST


On 10/15/25 01:59, Sean Christopherson wrote:
On Tue, Aug 19, 2025, Lei Chen wrote:
This patch series simplifies kvmclock updating logic by reverting
related commits.

Now we have three requests about time updating:

1. KVM_REQ_CLOCK_UPDATE:
The function kvm_guest_time_update gathers info from master clock
or host.rdtsc() and update vcpu->arch.hvclock, and then kvmclock or hyperv
reference counter.

2. KVM_REQ_MASTERCLOCK_UPDATE:
The function kvm_update_masterclock updates kvm->arch from
pvclock_gtod_data(a global var updated by timekeeping subsystem), and
then make KVM_REQ_CLOCK_UPDATE request for each vcpu.

3. KVM_REQ_GLOBAL_CLOCK_UPDATE:
The function kvm_gen_kvmclock_update makes KVM_REQ_CLOCK_UPDATE
request for each vcpu.

In the early implementation, functions mentioned above were
synchronous. But things got complicated since the following commits.

1. Commit 7e44e4495a39 ("x86: kvm: rate-limit global clock updates")
intends to use kvmclock_update_work to sync ntp corretion
across all vcpus kvmclock, which is based on commit 0061d53daf26f
("KVM: x86: limit difference between kvmclock updates")


2. Commit 332967a3eac0 ("x86: kvm: introduce periodic global clock
updates") introduced a 300s-interval work to periodically sync
ntp corrections across all vcpus.

I think those commits could be reverted because:
1. Since commit 53fafdbb8b21 ("KVM: x86: switch KVMCLOCK base to
monotonic raw clock"), kvmclock switched to mono raw clock,
Those two commits could be reverted.

2. the periodic work introduced from commit 332967a3eac0 ("x86:
kvm: introduce periodic global clock updates") always does
nothing for normal scenarios. If some exceptions happen,
the corresponding logic makes right CLOCK_UPDATE request for right vcpus.
The following shows what exceptions might happen and how they are
handled.
(1). cpu_tsc_khz changed
__kvmclock_cpufreq_notifier makes KVM_REQ_CLOCK_UPDATE request
(2). use/unuse master clock
kvm_track_tsc_matching makes KVM_REQ_MASTERCLOCK_UPDATE, which means
KVM_REQ_CLOCK_UPDATE for each vcpu.
(3). guest writes MSR_IA32_TSC
kvm_synchronize_tsc will handle it and finally call
kvm_track_tsc_matching to make everything well.
(4). enable/disable tsc_catchup
kvm_arch_vcpu_load and bottom half of vcpu_enter_guest makes
KVM_REQ_CLOCK_UPDATE request

Really happy for your comments, thanks.

Related links:
https://lkml.indiana.edu/hypermail/linux/kernel/2310.0/04217.html
https://patchew.org/linux/20240522001817.619072-1-dwmw2@xxxxxxxxxxxxx/20240522001817.619072-20-dwmw2@xxxxxxxxxxxxx/

I would love, love, *love* to kill of this code, and the justification looks sane
to me, but I am genuinely not knowledgeable enough in this area to judge whether
or not this is correct/desirable going forward.

Yes, Lei Chen's reasoning are correct. In particular for the cases that trigger KVM_REQ_CLOCK_UPDATE, they really should do it immediately and not rely on periodic updates!

Paolo