Hi,
Testing
~~~~~~~
Enqueue
^^^^^^^
The impact of wasting cycles during enqueue by using the heuristic, in
contrast to always queuing the timer on the local CPU, was measured with a
micro benchmark: a timer is enqueued and dequeued in a loop with 1000
repetitions on an isolated CPU, and the time the loop takes is measured. A
quarter of the remaining CPUs were kept busy. This measurement was repeated
several times. With the patch queue the average duration was reduced by
approximately 25%.
    145ns    plain v6
    109ns    v6 with patch queue
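The "approximately 25%" figure follows directly from the two averages. A
minimal sketch of that arithmetic, where the helper name
relative_reduction is hypothetical and only used for illustration:

```python
def relative_reduction(before_ns: float, after_ns: float) -> float:
    """Return the reduction from before_ns to after_ns in percent."""
    return (before_ns - after_ns) / before_ns * 100.0


# Averages quoted above: plain v6 vs. v6 with the patch queue.
reduction = relative_reduction(145.0, 109.0)
print(f"{reduction:.1f}%")  # prints 24.8%
```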
Furthermore, the impact on residency in deep idle states of an idle system
was investigated. The patch queue does not degrade this behavior.
dbench test
^^^^^^^^^^^
A dbench test starting X client/server pairs is used to create load on
the system. The measured value is the throughput. The tests were executed
on a zen3 machine. The base is the tip tree branch timers/core, which is
based on v6.6-rc1.
governor menu
X pairs   timers/core        pull-model         impact
------------------------------------------------------
1           353.19 (0.19)      353.45 (0.30)     0.07%
2           700.10 (0.96)      687.00 (0.20)    -1.87%
4          1329.37 (0.63)     1282.91 (0.64)    -3.49%
8          2561.16 (1.28)     2493.56 (1.76)    -2.64%
16         4959.96 (0.80)     4914.59 (0.64)    -0.91%
32         9741.92 (3.44)     8979.83 (1.13)    -7.82%
64        16535.40 (2.84)    16388.47 (4.02)    -0.89%
128       22136.83 (2.42)    23174.50 (1.43)     4.69%
256       39256.77 (4.48)    38994.00 (0.39)    -0.67%
512       36799.03 (1.83)    38091.10 (0.63)     3.51%
1024      32903.03 (0.86)    35370.70 (0.89)     7.50%
governor teo
X pairs   timers/core        pull-model         impact
------------------------------------------------------
1           350.83 (1.27)      352.45 (0.96)     0.46%
2           699.52 (0.85)      690.10 (0.54)    -1.35%
4          1339.53 (1.99)     1294.71 (2.71)    -3.35%
8          2574.10 (0.76)     2495.46 (1.97)    -3.06%
16         4898.50 (1.74)     4783.06 (1.64)    -2.36%
32         9115.50 (4.63)     9037.83 (1.58)    -0.85%
64        16663.90 (3.80)    16042.00 (1.72)    -3.73%
128       25044.93 (1.11)    23250.03 (1.08)    -7.17%
256       38059.53 (1.70)    39658.57 (2.98)     4.20%
512       36369.30 (0.39)    38890.13 (0.36)     6.93%
1024      33956.83 (1.14)    35514.83 (0.29)     4.59%
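The impact column in both tables appears to be the relative change of the
pull-model throughput against the timers/core baseline. That reading is an
assumption, but it matches the listed numbers. A minimal sketch, with the
hypothetical helper impact_percent and three rows copied from the menu
governor table:

```python
def impact_percent(base: float, pull_model: float) -> float:
    """Relative throughput change of pull_model versus base, in percent."""
    return (pull_model - base) / base * 100.0


# (X pairs, timers/core, pull-model) taken from the menu governor table.
rows = [
    (1, 353.19, 353.45),
    (32, 9741.92, 8979.83),
    (1024, 32903.03, 35370.70),
]

for pairs, base, pull_model in rows:
    print(f"{pairs:>5} {impact_percent(base, pull_model):+.2f}%")
```

A positive value means higher throughput with the pull model, a negative
value means lower throughput.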
Ping Pong Observation
^^^^^^^^^^^^^^^^^^^^^
During testing on a mostly idle machine a ping pong game could be observed:
a process_timeout timer is expired remotely on a non-idle CPU. Then the CPU
where schedule_timeout() was executed to enqueue the timer comes out of
idle, restarts the timer using schedule_timeout() and goes back to idle
again. This is due to the fair scheduler, which tries to keep the task on
the CPU on which it previously executed.