RE: [PATCH 1/1] intel_pstate: Increase hold-off time before busyness is scaled

From: Doug Smythies
Date: Thu Feb 25 2016 - 14:51:28 EST

Next message: Andy Lutomirski: "Re: [tip:x86/urgent] x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32"
Previous message: Brian Gerst: "Re: [tip:x86/urgent] x86/entry/32: Add an ASM_CLAC to entry_SYSENTER_32"
In reply to: Stephane Gasparini: "Re: [PATCH 1/1] intel_pstate: Increase hold-off time before busyness is scaled"

Hi Steph,

On 2016.02.24 08:20 Stephane Gasparini wrote:
>> On Feb 19, 2016, at 5:38 PM, Doug Smythies <dsmythies@xxxxxxxxx> wrote:
>>> On 2016.02.19 03:12 Stephane Gasparini wrote:
>>>
>>> The issue you are reporting looks like one we improved on android by using
>>> the average pstate instead of using the last requested pstate
>>>
>>> We know that this is improving the ffmpeg encoding performance when using the
>>> load algorithm.
>>>
>>> see patch attached
>>>
>>> This patch is only applied on get_target_pstate_use_cpu_load however you can give
>>> it a try on get_target_pstate_use_performance
>>
>> Yes, that type of patch works on the load based approach.
>
> I'm not talking about using average p-state in the scaled_busy computation.
> I'm talking about adding the output of the PID (the number of pstates to add or subtract)
> to the average pstate rather than adding this to the current p-state.

For the situation we are dealing with here, that would actually make it worse,
wouldn't it?

Let's work through a real, very low load example from the Mel V2 patch, where
the target pstate was increased when it should have been decreased:

Mel patch version 2 (12X hold-off, applied on top of the rjw 3-patch v10 set on kernel 4.5-rc4):

CPU: 3
Core busy: 105
Scaled busy: 143
Old pstate: 25
New pstate: 34
mperf: 52039
aperf: 55097
tsc: 335265689
freq: 3599750 kHz
Load: 0.02%
Duration (ms): 98.293

New pstate = old pstate + (scaled_busy-setpoint) * p_gain
= 25 + (143 - 97) * 0.2
= 34 (as above)
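
The arithmetic above can be checked with a minimal sketch. It assumes plain floating-point math (the driver itself uses fixed-point), and the function and constant names are illustrative only, not taken from intel_pstate. The setpoint of 97 and gain of 0.2 are the values used in the calculation above.

```python
# Values copied from the trace sample above; names are hypothetical.
SETPOINT = 97   # setpoint used in the arithmetic above
P_GAIN = 0.2    # proportional gain used in the arithmetic above


def p_only_step(old_pstate, scaled_busy, setpoint=SETPOINT, p_gain=P_GAIN):
    """Proportional-only update: old pstate plus gain times the error."""
    return old_pstate + (scaled_busy - setpoint) * p_gain


# With the hold-off in effect, scaled busy is 143, well above the setpoint,
# so the correction is positive even though the load is only 0.02%.
new_pstate = p_only_step(old_pstate=25, scaled_busy=143)
print(int(new_pstate))
```

Because the error term (143 - 97) is positive, any starting point fed into this update moves upward.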

Ave pstate = max_pstate * aperf / mperf
= 34 * 55097 / 52039
= 36

Steph's average pstate method applied to the above:
New pstate = ave pstate + (scaled_busy-setpoint) * p_gain
= 36 + (143 - 97) * 0.2
= 45 (before clamping)
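
A rough sketch of the same comparison follows, again assuming floating-point math and hypothetical helper names. It takes the average pstate as max_pstate * aperf / mperf, rounds it as the calculation above does, and adds the same proportional correction to it.

```python
def average_pstate(max_pstate, aperf, mperf):
    """Average pstate over the sample, as max_pstate * aperf / mperf."""
    return max_pstate * aperf / mperf


def step_from(base_pstate, scaled_busy, setpoint=97, p_gain=0.2):
    """Add the proportional correction to whichever base pstate is chosen."""
    return base_pstate + (scaled_busy - setpoint) * p_gain


# Counter values from the trace sample above.
ave = round(average_pstate(max_pstate=34, aperf=55097, mperf=52039))

from_last_request = step_from(25, scaled_busy=143)   # last requested pstate
from_average = step_from(ave, scaled_busy=143)       # average pstate

print(ave, int(from_last_request), int(from_average))
```

The correction is identical in both cases; only the base differs, and here the average pstate is higher than the last requested pstate, so the unclamped result is higher too.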

Now, just for completeness, here is the math without the Mel patch:
Scaled busy = Core busy * max_pstate / old pstate * sample time / duration
= 105 * 34 / 25 * 10 / 98.293
= 14.53
New pstate = old pstate + (scaled_busy-setpoint) * p_gain
= 25 + (14.53 - 97) * 0.2
= 8.5
= 16 (clamped to the minimum pstate)
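
The unpatched calculation can be sketched the same way. This is only an illustration under the same assumptions (floating-point math, hypothetical names); the 10 ms sample time and the minimum pstate of 16 are the values used in the calculation above, and only the lower clamp is modelled.

```python
MIN_PSTATE = 16  # minimum pstate used in the calculation above


def scaled_busy_with_duration(core_busy, max_pstate, old_pstate,
                              sample_time_ms, duration_ms):
    """Scale core busy by the pstate ratio, then by sample time / duration."""
    return core_busy * max_pstate / old_pstate * sample_time_ms / duration_ms


def p_only_step(old_pstate, scaled_busy, setpoint=97, p_gain=0.2):
    """Proportional-only update: old pstate plus gain times the error."""
    return old_pstate + (scaled_busy - setpoint) * p_gain


# The long idle period (98.293 ms against a 10 ms sample time) shrinks
# scaled busy to far below the setpoint.
scaled = scaled_busy_with_duration(105, 34, 25, 10, 98.293)
raw = p_only_step(25, scaled)
target = max(raw, MIN_PSTATE)  # lower clamp only

print(round(scaled, 2), round(raw, 1), target)
```

With the duration scaling applied immediately, the error term is negative and the request falls to the minimum pstate, which is the direction one would expect at 0.02% load.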

Regardless, I coded the average pstate method and, with limited testing,
observed little difference between it and the Mel V2 patch.

... Doug
