Re: [PATCH] cpufreq, intel_pstate, Fix intel_pstate powersave min_perf_pct value

From: Rafael J. Wysocki
Date: Wed Oct 14 2015 - 20:01:22 EST


On Wednesday, October 14, 2015 07:59:38 PM Prarit Bhargava wrote:
>
> On 10/14/2015 05:09 PM, Kristen Carlson Accardi wrote:
> > On Wed, 14 Oct 2015 07:41:59 -0400
> > Prarit Bhargava <prarit@xxxxxxxxxx> wrote:
> >
> >> Systems that initialize the intel_pstate driver with the performance
> >> governor and then switch to the powersave governor will not transition to
> >> lower CPU frequencies until /sys/devices/system/cpu/intel_pstate/min_perf_pct
> >> is set to a low value.
> >>
> >> The behavior of governor switching changed after commit a04759924e25
> >> ("[cpufreq] intel_pstate: honor user space min_perf_pct override on
> >> resume"). The commit introduced tracking of performance percentage
> >> changes via sysfs in order to restore userspace changes during
> >> suspend/resume. The problem occurs because the global values of the newly
> >> introduced max_sysfs_pct and min_sysfs_pct are not lowered on the governor
> >> change and this causes the powersave governor to inherit the performance
> >> governor's settings.
> >>
> >> A simple change would have been to reset max_sysfs_pct to 100 and
> >> min_sysfs_pct to 0 on a governor change, which fixes the problem with
> >> governor switching. However, since we cannot break userspace[1] the fix
> >> is now to give each governor its own limits storage area so that governor
> >> specific changes are tracked.
> >>
> >> I successfully tested this by booting with both the performance governor
> >> and the powersave governor by default, and switching between the two
> >> governors (while monitoring /sys/devices/system/cpu/intel_pstate/ values,
> >> and looking at the output of cpupower frequency-info). Suspend/Resume
> >> testing was performed by Doug Smythies.
> >>
> >> [1] Systems which suspend/resume using the unmaintained pm-utils package
> >> will always transition to the performance governor before the suspend and
> >> after the resume. This means a system using the powersave governor will
> >> go from powersave to performance, then suspend/resume, performance to
> >> powersave. The simple change during governor changes would have been
> >> overwritten when the governor changed before and after the suspend/resume.
> >> I have submitted https://bugzilla.redhat.com/show_bug.cgi?id=1271225
> >> against Fedora to remove the 94cpufreq file that causes the problem. It
> >> should be noted that pm-utils is obsoleted by newer versions of systemd.
> >>
> >> Cc: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
> >> Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> >> Cc: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> >> Cc: linux-pm@xxxxxxxxxxxxxxx
> >> Cc: Doug Smythies <dsmythies@xxxxxxxxx>
> >> Signed-off-by: Prarit Bhargava <prarit@xxxxxxxxxx>
> >
> > Acked-by: Kristen Carlson Accardi <kristen@xxxxxxxxxxxxxxx>
> >
> > BTW - I think I can see an issue here with HWP enabled systems. It
> > looks to me like the hwp settings will not be programmed correctly
> > during a governor switch. This probably needs to be addressed in a
> > separate patch.
> >
>
> Oh, I see it now too. I'll get to that in another patch. Thanks for pointing
> that out, Kristen.

The $subject patch doesn't apply any more after the series from Srinivas that
I've just applied.

Can you please rebase it on top of my bleeding-edge branch?

Thanks,
Rafael
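
For readers following the thread, the difference between one shared limits record and the per-governor storage described in the quoted changelog can be sketched as a small toy model. This is only an illustration in Python, not the kernel code: the class names, the `write_min_perf_pct` helper, and the assumed starting state (booted with the performance governor, minimum pinned at 100) are all hypothetical; only `min_sysfs_pct` and `max_sysfs_pct` are names taken from the changelog.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    # Hypothetical record of the sysfs-requested bounds, in percent.
    min_sysfs_pct: int = 0
    max_sysfs_pct: int = 100


class SharedLimits:
    """One global record: values set under one governor leak into the next."""

    def __init__(self):
        # Assumed starting state: performance governor, minimum pinned at 100.
        self.governor = "performance"
        self.limits = Limits(min_sysfs_pct=100)

    def set_governor(self, name):
        # The record is not touched on a switch, so the old values persist.
        self.governor = name

    def min_pct(self):
        return self.limits.min_sysfs_pct


class PerGovernorLimits:
    """One record per governor: a switch selects that governor's own values."""

    def __init__(self):
        self.governor = "performance"
        self.by_governor = {
            "performance": Limits(min_sysfs_pct=100),
            "powersave": Limits(),
        }

    def set_governor(self, name):
        if name not in self.by_governor:
            raise ValueError("unknown governor: " + name)
        self.governor = name

    def write_min_perf_pct(self, pct):
        # Models a userspace write; it is stored for the active governor only.
        self.by_governor[self.governor].min_sysfs_pct = max(0, min(pct, 100))

    def min_pct(self):
        return self.by_governor[self.governor].min_sysfs_pct
```

In this model, switching `SharedLimits` to powersave leaves the minimum at the performance value, which mirrors the reported symptom. With `PerGovernorLimits`, a value written under powersave is still there after a round trip through performance, which is the property the pm-utils footnote asks for.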
