On Fri, Mar 17, 2017 at 3:58 AM, Tvrtko Ursulin
<tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote:
Hi Andy, all,
I have bisected and verified an interesting performance regression caused by
commit c5552fde102fcc3f2cf9e502b8ac90e3500d8fdf "nvme: Enable autonomous
power state transitions".
Having that patch or not accounts for approx. 3% perf difference in a test
which is, and this is the best part, not even i/o bound by any stretch of
the imagination.
Weird. Is there something in the workload that is periodically
reading or otherwise doing a synchronous IO? APST shouldn't affect
throughput in any meaningful way, but it certainly affects latency of
sporadic IO.
The test is multi-process with overall medium CPU usage and high GPU (Intel)
usage. Average runtime is around 13 seconds during which it writes out
around 14MiB of data.
It does so in chunks during the whole runtime but doesn't do anything
special, just normal O_RDWR | O_CREAT | O_TRUNC so in practice this is all
written to the device only the end of the test run in one chunk. Via the
background write out I suspect.
The 3% mentioned earlier translates to approx. 430ms longer average runtime
with the above patch.
That is surprisingly long. If your device is telling the truth about
its latencies, this should be literally impossible.
ps 0 : mp:9.00W operational enlat:5 exlat:5 rrt:0 rrl:0
rwt:0 rwl:0 idle_power:- active_power:-
This is the normal on state.
ps 1 : mp:4.60W operational enlat:30 exlat:30 rrt:1 rrl:1
rwt:1 rwl:1 idle_power:- active_power:-
ps 2 : mp:3.80W operational enlat:30 exlat:30 rrt:2 rrl:2
rwt:2 rwl:2 idle_power:- active_power:-
These won't be used at all.
ps 3 : mp:0.0700W non-operational enlat:10000 exlat:300 rrt:3 rrl:3
rwt:3 rwl:3 idle_power:- active_power:-
ps 4 : mp:0.0050W non-operational enlat:2000 exlat:10000 rrt:4 rrl:4
rwt:4 rwl:4 idle_power:- active_power:-
These two have their latencies specified in microseconds and will be
used autonomously.
Can you re-test with
/sys/class/nvme/nvme0/power/pm_qos_latency_tolerance_us set to the
following values and report back?
0 - will turn APST off entirely
1 - will leave APST on but will not program any transitions
11000 - should allow state 3 but not state 4