On a large system with many CPUs, using HPET as the clock source can
have a significant negative impact on overall system performance
for the following reasons:
1) There is a single HPET counter shared by all the CPUs.
2) HPET counter reading is a very slow operation.
HPET may become the default clock source when, for example, the TSC
clock calibration exceeds the allowable tolerance. Sometimes the
performance slowdown can be so severe that the system may crash
because of an NMI watchdog soft lockup, for example.
This patch attempts to reduce HPET read contention by using the fact
that if more than one CPU is trying to access HPET at the same time,
it is more efficient for one CPU in the group to read the HPET
counter and share it with the rest of the group than for each group
member to read the HPET counter individually.
This is done by using a combination word with a sequence number and
a bit lock. The CPU that gets the bit lock will be responsible for
reading the HPET counter and updating the sequence number. The others
will monitor the change in sequence number and grab the saved HPET
counter value accordingly. This change is enabled on SMP
configurations.
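The scheme described above can be illustrated with a minimal userspace
sketch. This is not the code in the patch: it uses C11 atomics instead
of kernel primitives, and the names hpet_cache, slow_counter_read(),
shared_counter_read(), fake_counter and hw_reads are hypothetical,
chosen only for illustration. It assumes the lock bit is bit 0 of the
combined word and that the sequence number advances in steps of 2.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SEQ_LOCKED 1u	/* assumed layout: bit 0 is the bit lock */

/* Hypothetical stand-in for the saved counter state. */
struct hpet_cache {
	_Atomic uint32_t seq;	/* sequence number, lock in bit 0 */
	_Atomic uint32_t value;	/* last counter value that was read */
};

/* Hypothetical stand-ins for the slow hardware counter. */
static _Atomic uint32_t fake_counter;
static _Atomic uint32_t hw_reads;	/* number of "hardware" reads */

static uint32_t slow_counter_read(void)
{
	atomic_fetch_add(&hw_reads, 1);
	return atomic_fetch_add(&fake_counter, 1) + 1;
}

static uint32_t shared_counter_read(struct hpet_cache *c)
{
	for (;;) {
		uint32_t seq = atomic_load_explicit(&c->seq,
						    memory_order_acquire);

		if (!(seq & SEQ_LOCKED)) {
			/* Try to take the bit lock; retry if we lose the race. */
			if (!atomic_compare_exchange_strong_explicit(
				    &c->seq, &seq, seq | SEQ_LOCKED,
				    memory_order_acquire, memory_order_relaxed))
				continue;

			/* Lock holder: do the one slow read and publish it. */
			uint32_t v = slow_counter_read();

			atomic_store_explicit(&c->value, v,
					      memory_order_relaxed);
			/* Advance the sequence number and drop the lock. */
			atomic_store_explicit(&c->seq, seq + 2,
					      memory_order_release);
			return v;
		}

		/* Waiter: spin until the locked word changes, then reuse. */
		while (atomic_load_explicit(&c->seq,
					    memory_order_acquire) == seq)
			;
		return atomic_load_explicit(&c->value, memory_order_relaxed);
	}
}
```

In the uncontended case every call still performs its own counter read.
Only callers that find the lock bit set skip the slow read and take the
value published by the lock holder.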
On a 4-socket Haswell-EX box with 72 cores (HT off), running the
AIM7 compute workload (1500 users) on a 4.6-rc1 kernel (HZ=1000)
with and without the patch gave the following performance numbers
(with HPET or TSC as clock source):
  TSC             = 646515 jobs/min
  HPET w/o patch  = 566708 jobs/min
  HPET with patch = 638791 jobs/min
The perf profile showed a reduction of the %CPU time consumed by
read_hpet from 4.99% without patch to 1.41% with patch.
On a 16-socket IvyBridge-EX system with 240 cores (HT on), on the
other hand, the performance numbers of the same benchmark were:
  TSC             = 3145329 jobs/min
  HPET w/o patch  = 1108537 jobs/min
  HPET with patch = 3019934 jobs/min
The corresponding perf profile showed a drop of CPU consumption of
the read_hpet function from more than 34% to just 2.96%.
Signed-off-by: Waiman Long <Waiman.Long@xxxxxxx>
---
v3->v4:
- Move hpet_save inside the CONFIG_SMP block to fix a compilation
warning in non-SMP build.
v2->v3:
- Make the hpet optimization the default for SMP configuration. So
no documentation change is needed.
- Remove threshold checking code as it should not be necessary and
can be potentially unsafe.
v1->v2:
- Reduce the CPU threshold to 32.
- Add a kernel parameter to explicitly enable or disable hpet
optimization.
- Change hpet_save.hpet type to u32 to make sure that reads and
writes are atomic on i386.
arch/x86/kernel/hpet.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 84 insertions(+), 0 deletions(-)