Re: [RFC/RFT][PATCH v2 0/6] sched/cpuidle: Idle loop rework

From: Mike Galbraith
Date: Thu Mar 08 2018 - 05:31:50 EST


On Tue, 2018-03-06 at 09:57 +0100, Rafael J. Wysocki wrote:
> Hi All,

Greetings,

> Thanks a lot for the discussion so far!
>
> Here's a new version of the series addressing some comments from the
> discussion and (most importantly) replacing patches 4 and 5 with another
> (simpler) patch.

Oddity: these patches seemingly cost a bit of power when lightly
loaded (but didn't cut the cross-core nohz cost much.. darn).

i4790 booted nopti nospectre_v2

30 sec tbench
4.16.0.g1b88acc-master (virgin)
Throughput 559.279 MB/sec 1 clients 1 procs max_latency=0.046 ms
Throughput 997.119 MB/sec 2 clients 2 procs max_latency=0.246 ms
Throughput 1693.04 MB/sec 4 clients 4 procs max_latency=4.309 ms
Throughput 3597.2 MB/sec 8 clients 8 procs max_latency=6.760 ms
Throughput 3474.55 MB/sec 16 clients 16 procs max_latency=6.743 ms

4.16.0.g1b88acc-master (+v2)
Throughput 588.929 MB/sec 1 clients 1 procs max_latency=0.291 ms
Throughput 1080.93 MB/sec 2 clients 2 procs max_latency=0.639 ms
Throughput 1826.3 MB/sec 4 clients 4 procs max_latency=0.647 ms
Throughput 3561.01 MB/sec 8 clients 8 procs max_latency=1.279 ms
Throughput 3382.98 MB/sec 16 clients 16 procs max_latency=4.817 ms

4.16.0.g1b88acc-master (+local nohz mitigation etc for reference [1])
Throughput 722.559 MB/sec 1 clients 1 procs max_latency=0.087 ms
Throughput 1208.59 MB/sec 2 clients 2 procs max_latency=0.289 ms
Throughput 2071.94 MB/sec 4 clients 4 procs max_latency=0.654 ms
Throughput 3784.91 MB/sec 8 clients 8 procs max_latency=0.974 ms
Throughput 3644.4 MB/sec 16 clients 16 procs max_latency=5.620 ms
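
For easier comparison, the three tbench runs above can be reduced to
per-client-count deltas against the virgin kernel. The sketch below only
re-arranges the throughput figures already quoted; the names (tbench,
pct_change) are illustrative, and since each figure comes from a single
30 sec run, the run-to-run noise is unknown. On these numbers +v2 gains
at 1, 2 and 4 clients and slips slightly at 8 and 16, while +local is
ahead of virgin at every client count.

```python
# Throughput in MB/sec copied from the tbench runs above, one entry per client count.
clients = [1, 2, 4, 8, 16]
tbench = {
    "virgin": [559.279, 997.119, 1693.04, 3597.2, 3474.55],
    "+v2":    [588.929, 1080.93, 1826.3, 3561.01, 3382.98],
    "+local": [722.559, 1208.59, 2071.94, 3784.91, 3644.4],
}

def pct_change(kernel, baseline="virgin"):
    """Throughput change of `kernel` relative to `baseline`, in percent, per client count."""
    return [100.0 * (new - old) / old
            for new, old in zip(tbench[kernel], tbench[baseline])]

for kernel in ("+v2", "+local"):
    row = "  ".join(f"{n:2d} clients: {p:+5.1f}%"
                    for n, p in zip(clients, pct_change(kernel)))
    print(f"{kernel:7s} {row}")
```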

turbostat -q -- firefox /root/tmp/video/BigBuckBunny-DivXPlusHD.mkv & sleep 300;killall firefox

                          PkgWatt
                          1      2      3
4.16.0.g1b88acc-master    6.95   7.03   6.91   (virgin)
4.16.0.g1b88acc-master    7.20   7.25   7.26   (+v2)
4.16.0.g1b88acc-master    6.90   7.06   6.95   (+local)

Why would v2 charge the light firefox load a small but consistent fee?
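
To put a number on that fee, here is a minimal sketch that averages the
three PkgWatt readings per kernel and expresses each mean relative to
virgin. It uses only the figures in the table above; the helper name
relative_to_baseline is made up for illustration, and three samples per
kernel are too few for a real significance test. The +v2 mean comes out
roughly 0.27 W (about 4%) above virgin, and every +v2 reading is higher
than every virgin or +local reading.

```python
from statistics import mean

# PkgWatt readings copied from the three turbostat runs above.
pkg_watt = {
    "virgin": [6.95, 7.03, 6.91],
    "+v2":    [7.20, 7.25, 7.26],
    "+local": [6.90, 7.06, 6.95],
}

def relative_to_baseline(samples, baseline="virgin"):
    """Map each kernel to (mean watts, percent difference from the baseline mean)."""
    base = mean(samples[baseline])
    return {name: (mean(vals), 100.0 * (mean(vals) - base) / base)
            for name, vals in samples.items()}

summary = relative_to_baseline(pkg_watt)
for name, (avg, pct) in summary.items():
    print(f"{name:8s} {avg:.2f} W  {pct:+.1f}%")
```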

-Mike

[1] See the low end; that's largely due to nohz throttling.