Re: [PATCH v7] sched/clock: Avoid false sharing for sched_clock_irqtime
From: Shrikanth Hegde
Date: Wed Jan 28 2026 - 02:57:26 EST
On 1/28/26 1:20 PM, K Prateek Nayak wrote:
On 1/28/2026 1:02 PM, Shrikanth Hegde wrote:
On 1/28/26 12:48 PM, K Prateek Nayak wrote:
On 1/28/2026 11:56 AM, Shrikanth Hegde wrote:
On 1/28/26 8:35 AM, K Prateek Nayak wrote:
On 1/28/2026 7:49 AM, Guo, Wangyang wrote:
Yes, when the clock is marked unstable through tsc_.*mark_unstable() with a non-native sched_clock, clear_sched_clock_stable() won't be called, so sched_clock_irqtime stays enabled.
Maybe the dedicated workqueue for sched_clock_irqtime is still needed considering this case.
In that case, shouldn't tsc_init() only enable irqtime when
using_native_sched_clock()? How can tsc_init() make a call on irqtime if
TSC isn't being used as the sched_clock() ultimately?
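A minimal sketch of the gating suggested above, written as a userspace model rather than kernel code. The model_* names are hypothetical, and using_native_sched_clock() and enable_sched_clock_irqtime() are stubbed locally only to mirror the names used in this thread:

```c
#include <stdbool.h>

/* Userspace model only: these globals stand in for kernel state. */
static bool model_native_sched_clock;	/* hypothetical: true when sched_clock() reads the TSC */
static bool model_irqtime_enabled;	/* stands in for sched_clock_irqtime */

/* Local stub, not the kernel implementation. */
static bool using_native_sched_clock(void)
{
	return model_native_sched_clock;
}

/* Local stub, not the kernel implementation. */
static void enable_sched_clock_irqtime(void)
{
	model_irqtime_enabled = true;
}

/*
 * Hypothetical tail of tsc_init(): once the TSC frequency is known,
 * enable irqtime only if the TSC actually backs sched_clock().
 */
static void model_tsc_init_irqtime(bool tsc_calibrated)
{
	if (tsc_calibrated && using_native_sched_clock())
		enable_sched_clock_irqtime();
}
```

With a paravirt sched_clock() installed (model_native_sched_clock == false), the model leaves irqtime disabled even when TSC calibration succeeds.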
For kvmclock, if PVCLOCK_TSC_STABLE_BIT is not set, it'll call
clear_sched_clock_stable() at kvm_sched_clock_init(), but none of the
other clocksources do that, so we can assume that once we override
sched_clock(), it is up to the sched_clock() provider to deal with
clock stability.
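The kvmclock behaviour described above can be sketched roughly as follows. This is a userspace model under stated assumptions, not the kernel source: the MODEL_ bit value is a placeholder, and the kvmclock_model_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bit for this model; the real definition lives in the kernel headers. */
#define MODEL_PVCLOCK_TSC_STABLE_BIT	(1u << 0)

/* Stands in for the scheduler's "sched_clock is stable" state. */
static bool kvmclock_model_sched_clock_stable = true;

/* Hypothetical stand-in for clear_sched_clock_stable(). */
static void kvmclock_model_clear_stable(void)
{
	kvmclock_model_sched_clock_stable = false;
}

/*
 * Hypothetical stand-in for kvm_sched_clock_init(): the sched_clock()
 * provider decides about stability from the flags the host exposes.
 */
static void kvmclock_model_sched_clock_init(uint8_t pvclock_flags)
{
	if (!(pvclock_flags & MODEL_PVCLOCK_TSC_STABLE_BIT))
		kvmclock_model_clear_stable();
}
```

The point of the sketch is only that the stability decision is taken once, at init time, by the provider that overrides sched_clock().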
I think this would depend on whether mark_tsc_unstable() happens after system boot,
especially while running a KVM guest?
I don't see anything on the guest side that would mark the kvmclock as
unstable if host's TSC turns unstable post init and since kvmclock
doesn't set CLOCK_SOURCE_MUST_VERIFY, I doubt if a watchdog runs to
verify it in the guest.
I have the following in the guest:
$ sudo dmesg | grep -i clock
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: using sched offset of 423259259 cycles
This means pv_sched_clock is kvm_sched_clock_read from now on, and
irqtime is enabled in the guest, right?
So within the guest today ...
$ sudo dmesg | grep -i "clock\|tsc"
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: using sched offset of 504626078 cycles
# kvm_sched_clock_init() happens here so it can potentially do
# clear_sched_clock_stable() here if !PVCLOCK_TSC_STABLE_BIT.
[ 0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.000004] tsc: Detected 1996.251 MHz processor
# We enable irqtime here once TSC frequency has been determined
# without considering using_native_sched_clock()
After that TSC is never selected so we don't care if it is stable
or not since it is not the clocksource - the guest continues on
with unstable sched_clock() but also irqtime enabled since TSC
was calibrated successfully.
[ 0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.071675] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.378467] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[ 0.388678] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cb1e4d56, max_idle_ns: 881590790753 ns
[ 0.679262] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.903121] PTP clock support registered
[ 0.927243] clocksource: Switched to clocksource kvm-clock
[ 0.944986] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[ 0.993198] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cb1e4d56, max_idle_ns: 881590790753 ns
[ 1.123796] rtc_cmos 00:05: setting system clock to 2026-01-28T07:03:45 UTC (1769583825)
[ 1.155755] sched_clock: Marking stable (940009972, 212965288)->(1171254846, -18279586)
[ 1.712598] clk: Disabling unused clocks
Then I mark TSC unstable on the host
tsc: Marking TSC unstable due to Faking unreliable TSC!
TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
clocksource: Checking clocksource tsc synchronization from CPU 93 to CPUs 0,2,26,75,101,114,118,195.
sched_clock: Marking unstable (945948313746, 69389667)<-(947618130068, -1600430832)
clocksource: CPU 93 check durations 3436ns - 25277ns for clocksource tsc.
clocksource: Switched to clocksource hpet
So now, using_native_sched_clock() should fail in the guest? If so, with the patch,
irqtime won't be disabled, no?
Ideally yes, but the guest continues using kvmclock without any hitch.
I think the x86 KVM layer has something to ensure stability but I'm
not 100% sure.
Since I don't see "tsc: Marking TSC unstable ..." or "sched_clock:
Marking unstable ..." in the guest, we don't hit the mark_tsc_unstable()
path within the guest, which is what would disable irqtime today. So
essentially, the host's TSC turning unstable doesn't seem to affect the guest.
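To make the path discussed above concrete, here is a small userspace model of how a mark_tsc_unstable()-style call affects guest state today, as described in this thread. The struct and function names are hypothetical and the logic is a simplification, not the kernel implementation:

```c
#include <stdbool.h>

/* Hypothetical snapshot of the guest-side state discussed in this thread. */
struct guest_clock_model {
	bool native_sched_clock;	/* false when kvmclock backs sched_clock() */
	bool tsc_unstable;
	bool sched_clock_stable;
	bool irqtime_enabled;
};

/*
 * Simplified model of the mark_tsc_unstable() path: stability is only
 * cleared when the TSC backs sched_clock(), while irqtime is disabled
 * regardless of the sched_clock() provider.
 */
static void model_mark_tsc_unstable(struct guest_clock_model *g)
{
	if (g->tsc_unstable)
		return;

	g->tsc_unstable = true;
	if (g->native_sched_clock)
		g->sched_clock_stable = false;
	g->irqtime_enabled = false;
}
```

In the scenario from this thread nothing in the guest invokes this path, so a kvmclock guest in the model keeps irqtime_enabled == true after the host marks its own TSC unstable.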
Okay. Fair enough.
Then v7 should cover all scenarios, I think. With that,
Reviewed-by: Shrikanth Hegde <sshegde@xxxxxxxxxxxxx>