Re: [PATCH] acpi_pm: Reduce PMTMR counter read contention

From: Zhenzhong Duan
Date: Fri Feb 01 2019 - 21:51:18 EST


On 2019/1/31 22:26, Thomas Gleixner wrote:
On Thu, 31 Jan 2019, Zhenzhong Duan wrote:

On 2019/1/30 16:06, Thomas Gleixner wrote:
On Tue, 22 Jan 2019, Zhenzhong Duan wrote:

On a large system with many CPUs, using PMTMR as the clock source can
have a significant impact on the overall system performance because
of the following reasons:
1) There is a single PMTMR counter shared by all the CPUs.
2) PMTMR counter reading is a very slow operation.

Using PMTMR as the default clock source may happen when, for example,
the TSC clock calibration exceeds the allowable tolerance and HPET
disabled by nohpet on kernel command line. Sometimes the performance

The question is why would anyone disable HPET on a larger machine when the
TSC is wreckaged?

There may be broken hardware where TSC is wreckaged.

I know that.

I'm not against the change per se, but I really want to understand why we
need all the complexity for something which should never be used in a real
world deployment.

Hmm, it's a strong word of "never be used". Customers may happen to use
nohpet(sanity test?) and report bug to us. Sometimes they does report a bug
that reproduce with their customed config. There may also be BIOS setting HPET
disabled.

And because the customer MAY do completely nonsensical things (and there
are a lot more than the HPET) the kernel has to handle all of them?

Ok, then. I don't have more suggestion to convince you. I just think of a simple fix as below. I think it will work for both hpet and pmtmr. We will test it when the env is available.

--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1353,6 +1353,7 @@ static int change_clocksource(void *data)

write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
+ tick_clock_notify();

return 0;
}
@@ -1371,7 +1372,6 @@ int timekeeping_notify(struct clocksource *clock)
if (tk->tkr_mono.clock == clock)
return 0;
stop_machine(change_clocksource, clock, NULL);
- tick_clock_notify();
return tk->tkr_mono.clock == clock ? 0 : -1;
}


Thanks
Zhenzhong