Re: [PATCH] acpi_pm: Reduce PMTMR counter read contention
From: Zhenzhong Duan
Date: Wed Jan 30 2019 - 22:49:22 EST
On 2019/1/30 16:06, Thomas Gleixner wrote:
> On Tue, 22 Jan 2019, Zhenzhong Duan wrote:
>> On a large system with many CPUs, using PMTMR as the clock source can
>> have a significant impact on the overall system performance because
>> of the following reasons:
>> 1) There is a single PMTMR counter shared by all the CPUs.
>> 2) PMTMR counter reading is a very slow operation.
>> Using PMTMR as the default clock source may happen when, for example,
>> the TSC clock calibration exceeds the allowable tolerance and HPET
>> disabled by nohpet on kernel command line. Sometimes the performance
> The question is why would anyone disable HPET on a larger machine when the
> TSC is wreckaged?
There may be broken hardware where the TSC is wreckaged.
On our instances (X8-8/X7-8), the TSC isn't wreckaged. Sometimes we are
lucky and get through the bootup stage, and then TSC ends up as the final
default clocksource. See the log:
[ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[ 13.963224] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[ 19.903175] clocksource: Switched to clocksource refined-jiffies
[ 20.190467] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[ 20.201634] clocksource: Switched to clocksource acpi_pm
[ 39.082577] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2113ba2fe3c, max_idle_ns: 440795266816 ns
[ 39.138781] clocksource: Switched to clocksource tsc
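As a side note on the acpi_pm line above: the mask 0xffffff means the
counter is only 24 bits wide. Below is a minimal sketch of the arithmetic
that follows from that. It assumes the 3.579545 MHz PM timer frequency from
the ACPI specification (that number does not appear in this thread), and
the names SKETCH_PMTMR_HZ, SKETCH_PM_MASK, pmtmr_wrap_ns() and
pmtmr_delta() are made up for illustration, not kernel symbols.

```c
#include <stdint.h>

/* Assumption: PM timer frequency per the ACPI specification. */
#define SKETCH_PMTMR_HZ 3579545ULL
/* 24-bit mask, as printed in the acpi_pm log line. */
#define SKETCH_PM_MASK  0xffffffULL

/* Nanoseconds until a free-running 24-bit counter wraps around. */
static uint64_t pmtmr_wrap_ns(void)
{
	return (SKETCH_PM_MASK + 1) * 1000000000ULL / SKETCH_PMTMR_HZ;
}

/* Ticks elapsed between two raw reads, tolerating a single wrap. */
static uint32_t pmtmr_delta(uint32_t then, uint32_t now)
{
	return (uint32_t)((now - then) & SKETCH_PM_MASK);
}
```

With these assumptions the counter wraps roughly every 4.7 seconds, so it
has to be read regularly for timekeeping to stay correct.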
When we are unlucky, the log looks like this:
[ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[ 19.905741] clocksource: Switched to clocksource refined-jiffies
[ 20.181521] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[ 44.273786] watchdog: BUG: soft lockup - CPU#48 stuck for 23s! [swapper/48:0]
[ 44.279992] watchdog: BUG: soft lockup - CPU#49 stuck for 23s! [migration/49:307]
So we panicked while acpi_pm was initializing and had temporarily been
chosen as the default clocksource. It panicked just because we added the
nohpet parameter.
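To make the contention problem concrete, here is a user-space sketch of one
common way to cut down reads of a slow shared counter: only one caller at a
time touches the hardware, and callers that arrive while a read is in
flight wait for the published value and reuse it. This is an illustration
only and not the code from the patch. slow_counter_read() is a hypothetical
stand-in for the real port read, and shared_state, SKETCH_LOCKED and
shared_counter_read() are invented names.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Bit 32 marks "a read is in flight"; the low 32 bits hold the last value. */
#define SKETCH_LOCKED (1ULL << 32)

static _Atomic uint64_t shared_state;
static atomic_uint fake_hw_counter;	/* hypothetical hardware counter */
static atomic_uint hw_read_count;	/* how often the "hardware" was read */

/* Hypothetical stand-in for the slow 24-bit hardware counter read. */
static uint32_t slow_counter_read(void)
{
	atomic_fetch_add(&hw_read_count, 1);
	return (atomic_fetch_add(&fake_hw_counter, 1) + 1) & 0xffffff;
}

static uint32_t shared_counter_read(void)
{
	uint64_t old = atomic_load(&shared_state);
	uint64_t cur;

	/* Uncontended: take the lock bit and do the slow read ourselves. */
	if (!(old & SKETCH_LOCKED) &&
	    atomic_compare_exchange_strong(&shared_state, &old,
					   old | SKETCH_LOCKED)) {
		uint32_t val = slow_counter_read();

		/* Publish the new value and drop the lock bit in one store. */
		atomic_store(&shared_state, (uint64_t)val);
		return val;
	}

	/*
	 * Contended: someone else is reading (or just finished). Wait until
	 * a new value shows up or the lock bit is gone, then reuse it.
	 */
	do {
		cur = atomic_load(&shared_state);
	} while ((cur & SKETCH_LOCKED) && (uint32_t)cur == (uint32_t)old);

	return (uint32_t)cur;
}
```

In this sketch, a caller that finds no read in flight still pays for one
slow read, so the single-CPU cost is unchanged. The saving only shows up
when many CPUs ask for the time at once.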
> I'm not against the change per se, but I really want to understand why we
> need all the complexity for something which should never be used in a real
> world deployment.
Hmm, "never be used" is a strong statement. Customers may happen to use
nohpet (as a sanity test?) and report a bug to us. Sometimes they do report
a bug that only reproduces with their customized config. There may also be
a BIOS setting that disables HPET.
Thanks
Zhenzhong