Re: [RFC][PATCH 7/7] cpufreq: intel_pstate: Change P-state selection algorithm for Core
From: Peter Zijlstra
Date: Fri Aug 19 2016 - 10:47:48 EST
On Sat, Aug 13, 2016 at 08:59:01AM -0700, Doug Smythies wrote:
> My previous replies (and see below) have suggested that some filtering
> is needed on the target pstate, otherwise, and dependant on the type of
> workload, it tends to oscillate.
>
> I added the IIR (Infinite Impulse Response) filter that I have suggested in the past:
One question though; why is this filter intel_pstate specific? Should we
not do this in generic code?
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index c43ef55..262ec5f 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -1313,7 +1318,74 @@ static inline int32_t get_target_pstate_default(struct cpudata *cpu)
> cpu->iowait_boost >>= 1;
>
> pstate = cpu->pstate.turbo_pstate;
> + unfiltered_target = (pstate + (pstate >> 2)) * busy_frac;
> + duration_ns = cpu->sample.time - cpu->last_sample_time;
> +
> + scaled_gain = div_u64(int_tofp(duration_ns) *
> + int_tofp(pid_params.p_gain_pct), int_tofp(pid_params.sample_rate_ns));
Drop int_to_fp() on one of the dividend terms and in the divisor. Same
end result since they divide away against one another but reduces the
risk of overflow.
Also, sample_rate_ns, really!? A rate is in [1/s], should that thing be
called period_ns ?
> + if (scaled_gain > int_tofp(100))
> + scaled_gain = int_tofp(100);
> + if (scaled_gain < int_tofp(pid_params.p_gain_pct))
> + scaled_gain = int_tofp(pid_params.p_gain_pct);
> +
> + /*
> + * Bandwidth limit the output. For now, re-task p_gain_pct for this purpose.
> + * Use a smple IIR (Infinite Impulse Response) filter.
> + */
> + cpu->sample.target = div_u64((int_tofp(100) - scaled_gain) *
> + cpu->sample.target + scaled_gain *
> + unfiltered_target, int_tofp(100));
Really hard to read that stuff, maybe cure with a comment:
/*
* g = dt*p / period
*
* target' = (1 - g)*target + g*input
*/
> +
> + return fp_toint(cpu->sample.target + (1 << (FRAC_BITS-1)));
> }