Re: [PATCH v7] sched/clock: Avoid false sharing for sched_clock_irqtime

From: Shrikanth Hegde

Date: Wed Jan 28 2026 - 02:57:26 EST




On 1/28/26 1:20 PM, K Prateek Nayak wrote:
On 1/28/2026 1:02 PM, Shrikanth Hegde wrote:


On 1/28/26 12:48 PM, K Prateek Nayak wrote:
On 1/28/2026 11:56 AM, Shrikanth Hegde wrote:


On 1/28/26 8:35 AM, K Prateek Nayak wrote:
On 1/28/2026 7:49 AM, Guo, Wangyang wrote:
Yes, when the clock is marked unstable through tsc_.*mark_unstable() with a non-native sched_clock, clear_sched_clock_stable() won't be called, so sched_clock_irqtime stays enabled.

Maybe the dedicated workqueue for sched_clock_irqtime is still needed considering this case.

In that case, shouldn't tsc_init() only enable irqtime when
using_native_sched_clock()? How can tsc_init() make a call on irqtime if
TSC isn't being used as the sched_clock() ultimately?

For kvmclock, if PVCLOCK_TSC_STABLE_BIT is not set, it'll call
clear_sched_clock_stable() at kvm_sched_clock_init(), but none of the
other clocksources do, so we can assume that once we override sched_clock()
it is up to the sched_clock() provider to deal with the clock stability.
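
To make the case above concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: the struct and the model_* functions are hypothetical stand-ins for tsc_init(), mark_tsc_unstable() and using_native_sched_clock(), and the "patched" unstable path is assumed to disable irqtime only through the clear_sched_clock_stable() path, as described earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace toy model; not kernel code. All names here are
 * hypothetical stand-ins for the kernel helpers named in the thread.
 */
struct clock_model {
	/* false once a PV clock (e.g. kvmclock) overrides sched_clock() */
	bool native_sched_clock;
	bool irqtime_enabled;
};

/*
 * Behaviour described in the thread: irqtime is enabled once the TSC
 * frequency is known, without looking at the sched_clock() provider.
 */
static void model_tsc_init_unconditional(struct clock_model *m)
{
	m->irqtime_enabled = true;
}

/* Suggested alternative: only enable irqtime for the native sched_clock(). */
static void model_tsc_init_gated(struct clock_model *m)
{
	if (m->native_sched_clock)
		m->irqtime_enabled = true;
}

/*
 * Assumed behaviour with the patch: irqtime is disabled only via the
 * clear_sched_clock_stable() path, which is skipped when sched_clock()
 * is not the native one.
 */
static void model_mark_tsc_unstable(struct clock_model *m)
{
	if (m->native_sched_clock)
		m->irqtime_enabled = false;
}
```

In this model, a guest with native_sched_clock == false that goes through model_tsc_init_unconditional() and then model_mark_tsc_unstable() ends up with irqtime still enabled, whereas model_tsc_init_gated() never enables it in the first place.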


I think this would depend on whether mark_tsc_unstable() happens after system boot,
especially while running a KVM guest?

I don't see anything on the guest side that would mark the kvmclock as
unstable if host's TSC turns unstable post init and since kvmclock
doesn't set CLOCK_SOURCE_MUST_VERIFY, I doubt if a watchdog runs to
verify it in the guest.

I have the following in the guest:

     $ sudo dmesg | grep -i clock
     [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
     [    0.000000] kvm-clock: using sched offset of 423259259 cycles

This means pv_sched_clock is kvm_sched_clock_read from now on, and
irqtime is enabled in the guest, right?

So within the guest today ...

$ sudo dmesg | grep -i "clock\|tsc"
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: using sched offset of 504626078 cycles

# kvm_sched_clock_init() happens here so it can potentially do
# clear_sched_clock_stable() here if !PVCLOCK_TSC_STABLE_BIT.

[ 0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.000004] tsc: Detected 1996.251 MHz processor

# We enable irqtime here once TSC frequency has been determined
# without considering using_native_sched_clock()


After that TSC is never selected so we don't care if it is stable
or not since it is not the clocksource - the guest continues on
with unstable sched_clock() but also irqtime enabled since TSC
was calibrated successfully.


     [    0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
     [    0.071675] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
     [    0.378467] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
     [    0.388678] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cb1e4d56, max_idle_ns: 881590790753 ns
     [    0.679262] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
     [    0.903121] PTP clock support registered
     [    0.927243] clocksource: Switched to clocksource kvm-clock
     [    0.944986] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
     [    0.993198] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cb1e4d56, max_idle_ns: 881590790753 ns
     [    1.123796] rtc_cmos 00:05: setting system clock to 2026-01-28T07:03:45 UTC (1769583825)
     [    1.155755] sched_clock: Marking stable (940009972, 212965288)->(1171254846, -18279586)
     [    1.712598] clk: Disabling unused clocks

Then I mark TSC unstable on the host

     tsc: Marking TSC unstable due to Faking unreliable TSC!
     TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
     clocksource: Checking clocksource tsc synchronization from CPU 93 to CPUs 0,2,26,75,101,114,118,195.
     sched_clock: Marking unstable (945948313746, 69389667)<-(947618130068, -1600430832)
     clocksource:         CPU 93 check durations 3436ns - 25277ns for clocksource tsc.
     clocksource: Switched to clocksource hpet


So now, using_native_sched_clock() should fail in the guest? If so, with the patch,
irqtime won't be disabled, no?

Ideally yes, but the guest continues using kvmclock without any hitch.
I think the x86 KVM layer has something to ensure stability but I'm
not 100% sure.

Since I don't see "tsc: Marking TSC unstable ..." or "sched_clock:
Marking unstable ..." in the guest, we don't hit the mark_tsc_unstable()
path within the guest, which is what would disable irqtime today. So
essentially the host's TSC turning unstable doesn't seem to affect the guest.



Okay. Fair enough.
Then v7 should cover all scenarios, I think. With that,

Reviewed-by: Shrikanth Hegde <sshegde@xxxxxxxxxxxxx>