Re: [PATCH v7] sched/clock: Avoid false sharing for sched_clock_irqtime
From: Guo, Wangyang
Date: Tue Jan 27 2026 - 21:19:36 EST
On 1/27/2026 7:04 PM, Shrikanth Hegde wrote:
Hi Wangyang, Prateek.
On 1/27/26 12:55 PM, Wangyang Guo wrote:
Read-mostly sched_clock_irqtime may share the same cacheline with
frequently updated nohz struct. Make it as static_key to avoid
false sharing issue.
The only user of disable_sched_clock_irqtime()
is tsc_.*mark_unstable() which may be invoked under atomic context
and require a workqueue to disable static_key. But both of them
calls clear_sched_clock_stable() just before doing
disable_sched_clock_irqtime(). We can reuse
"sched_clock_work" to also disable sched_clock_irqtime().
One additional case need to handle is if the tsc is marked unstable
before late_initcall() phase, sched_clock_work will not be invoked
and sched_clock_irqtime will stay enabled although clock is unstable:
tsc_init()
enable_sched_clock_irqtime() # irqtime accounting is enabled here
...
if (unsynchronized_tsc()) # true
mark_tsc_unstable()
clear_sched_clock_stable()
__sched_clock_stable_early = 0;
...
if (static_key_count(&sched_clock_running.key) == 2)
# Only happens at sched_clock_init_late()
__clear_sched_clock_stable(); # Never executed
...
# late_initcall() phase
sched_clock_init_late()
if (__sched_clock_stable_early) # Already false
__set_sched_clock_stable(); # sched_clock is never marked stable
# TSC unstable, but sched_clock_work won't run to disable irqtime
So we need to disable_sched_clock_irqtime() in sched_clock_init_late()
if clock is unstable.
Do you this as a valid case? have you tested with CONFIG_PARAVIRT?
Lets say you have a non native sched clock such as kvm_sched_clock_read.
tsc_init -> sets enable_sched_clock_irqtime()
->mark_tsc_unstable -> if using_native_sched_clock -> clear_sched_clock_stable
In this case, since clear_sched_clock_stable won't be called you may not disable the
sched clock irqtime since __sched_clock_stable_early is reset only in clear_sched_clock_stable
For hypervisor, I see this path may call clear_sched_clock_stable when clock is unstable at init:
kvm_init_platform() ->
kvmclock_init() -> kvm_sched_clock_init(stable):
if (!stable) clear_sched_clock_stable()
paravirt_set_sched_clock(kvm_sched_clock_read)
Bigger concern(maybe) is clock source marked as stable still, though called mark_tsc_unstable in
non native sched clock?
Disclaimer: (just curious, seeing this x86 code for first time, so may not know all paths)
Yes, when clock mark unstable through tsc_.*mark_unstable() with non-native_sched_clock, clear_sched_clock_stable won't be called, thus sched_clock_irqtime still keep enabled.
Maybe the dedicated workqueue for sched_clock_irqtime is still needed considering this case.