RE: [RFT][PATCH v5 0/7] sched/cpuidle: Idle loop rework

From: Doug Smythies
Date: Tue Mar 20 2018 - 17:04:00 EST


Summary: My results with kernel 4.16-rc6 and V8 of the patch set
are completely different from my earlier ones, and now show no
clear difference (a longer test might reveal something).

On 2018.03.20 10:16 Doug Smythies wrote:
> On 2018.03.20 03:02 Thomas Ilsche wrote:
>
>...[snip]...
>
>> Consider the Skylake server system which has residencies in C1E of
>> 20 us and C6 of 800 us. I use a small while(1) {usleep(300);}
>> unsynchronized pinned to each core. While this is an artificial
>> case, it is a very innocent one - easy to predict and regular. Between
>> vanilla 4.16.0-rc5 and idle-loop/v6, the power consumption increases
>> from 149.7 W to 158.1 W. On 4.16.0-rc5, the cores sleep almost
>> entirely in C1E. With the patches applied, the cores spend ~75% of
>> their sleep time in C6, ~25% in C1E. The average time/usage for C1E is
>> also lower with v6 at ~350 us rather than the ~550 us in C6 (and in
>> C1E with the baseline). Generally the new menu governor seems to choose
>> C1E if the next timer is an enabled sched timer - which occasionally
>> interrupts the sleep-interval into two C1E sleeps rather than one C6.
>>
>> Manually disabling C6 reduces power consumption back to 149.5 W.
>
> ...[snip]...
>
> Note that one of the tests that I normally do is a work/sleep
> frequency sweep from 100 to 2100 Hz, typically at a lowish
> workload. I didn't notice anything odd with this test:
>
> http://fast.smythies.com/rjw_freq_sweep.png
>
> However, your test is at 3333 Hz (well, minus overheads).
> I did the same as you. And was surprised to confirm
> your power findings. In my case package power goes from
> ~8.6 watts to ~7.3 watts with idle state 4 (C6) disabled.
>
> I am getting different residency times than you though.
> I also observe different overheads between idle state 4
> being disabled or not. i.e. my actual loop frequency
> drops from ~2801 Hz to ~2754 Hz.
>
> Example residencies over the previous minute:
>
> Idle state 4 (C6) disabled (seconds):
>
> Idle state 0: 0.001119
> Idle state 1: 0.056638
> Idle state 2: 13.100550
> Idle state 3: 446.266744
> Idle state 4: 0.000000
>
> Idle state 4 (C6) enabled (seconds):
>
> Idle state 0: 0.034502
> Idle state 1: 1.949595
> Idle state 2: 78.291793
> Idle state 3: 96.467974
> Idle state 4: 286.247524
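
To make the two residency tables above easier to compare, here is
a minimal sketch that turns them into per-state shares of total
idle time. The numbers are copied from the tables; the function and
variable names are only illustrative.

```python
# Per-state idle residency in seconds, copied from the two tables above
# (index = idle state number).
C6_DISABLED = [0.001119, 0.056638, 13.100550, 446.266744, 0.000000]
C6_ENABLED = [0.034502, 1.949595, 78.291793, 96.467974, 286.247524]


def residency_shares(seconds):
    """Return each idle state's share of total idle time, in percent."""
    total = sum(seconds)
    return [100.0 * s / total for s in seconds]


for label, data in (("C6 disabled", C6_DISABLED), ("C6 enabled", C6_ENABLED)):
    shares = residency_shares(data)
    print(label + ":", " ".join(f"state{i}={p:.1f}%" for i, p in enumerate(shares)))
```

With these numbers, idle state 3 holds roughly 97% of the idle time
when C6 is disabled, and idle state 4 holds roughly 62% when it is
enabled.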

Now, with kernel 4.16-rc6, V8 of the patch set, and the poll fix,
I am unable to measure any processor package power difference
between idle state 4 (C6) enabled and disabled (i.e. it is in the noise).
Also, the loop time changes (overhead changes) are minimal. However,
the overall loop frequency has dropped to ~2730 Hz, so there seems to
be a little more overhead in general.
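
As a rough illustration of what those loop frequencies mean as
per-iteration overhead, here is a small sketch. It assumes the
300 us usleep() loop from Thomas's test, and the names are
hypothetical.

```python
SLEEP_US = 300.0  # assumed: the usleep(300) interval of the test loop


def period_us(freq_hz):
    """Measured loop period in microseconds for a given loop frequency."""
    return 1e6 / freq_hz


def overhead_us(freq_hz, sleep_us=SLEEP_US):
    """Time per iteration beyond the requested sleep, in microseconds."""
    return period_us(freq_hz) - sleep_us


# Loop frequencies reported in this thread.
for label, hz in (("earlier run", 2801), ("earlier run", 2754), ("V8 run", 2730)):
    print(f"{label}: {hz} Hz -> period {period_us(hz):.1f} us, "
          f"overhead {overhead_us(hz):.1f} us")
```

Under that assumption, ~2801 Hz works out to about 57 us of overhead
per iteration, ~2754 Hz to about 63 us, and ~2730 Hz to about 66 us.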

I increased my loop frequency to ~3316 Hz. Similar.

I increased my loop frequency to ~15474 Hz. Similar.
For comparison, with a stock 4.16-rc6 kernel the loop rate dropped
to ~15209 Hz and package power was about 0.3 watts higher
(out of 10.97, or ~3% more).
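
For anyone checking the arithmetic, a minimal sketch of the two
relative differences follows. The variable names are illustrative,
and treating 10.97 W as the reference for the percentage is my
reading of the numbers above.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


patched_hz = 15474.0      # loop rate with the patch set
stock_hz = 15209.0        # loop rate with the stock kernel
extra_watts = 0.3         # additional package power with the stock kernel
reference_watts = 10.97   # assumed reference package power

rate_drop_pct = percent(patched_hz - stock_hz, patched_hz)
power_increase_pct = percent(extra_watts, reference_watts)

print(f"loop rate drop: {rate_drop_pct:.2f}%")        # 1.71%
print(f"power increase: {power_increase_pct:.2f}%")   # 2.73%
```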

... Doug