Re: [PATCH 0/3] cpufreq: Replace timers with utilization update callbacks
From: Rafael J. Wysocki
Date: Wed Feb 03 2016 - 17:19:22 EST
On Friday, January 29, 2016 11:52:15 PM Rafael J. Wysocki wrote:
> Hi,
>
> The following patch series introduces a mechanism allowing the cpufreq core
> and "setpolicy" drivers to provide utilization update callbacks to be invoked
> by the scheduler on utilization changes. Those callbacks can be used to run
> the sampling and frequency adjustments code (intel_pstate) or to schedule the
> execution of that code in process context (cpufreq core) instead of per-CPU
> deferrable timers used in cpufreq today (which Thomas complained about during
> the last Kernel Summit).
>
> [1/3] Introduce a mechanism for calling into cpufreq from the scheduler and
> registering callbacks to be executed from there.
>
> [2/3] Modify intel_pstate to use the mechanism introduced by [1/3] instead
> of per-CPU deferrable timers to do its work.
>
> This isn't entirely straightforward as the scheduler context running those
> callbacks is really special. Among other things it can only use raw
> spinlocks and cannot invoke wake_up_process() directly. Also, calling
> ktime_get() from there may be too expensive on some systems. All that has to
> be taken into account, but even then the change allows some lines of code to be
> cut from the driver.
>
> Some performance and energy consumption measurements have been carried out with
> an earlier version of this patch and it looks like the changes lead to a
> slightly better performing system that consumes slightly less energy at the
> same time overall.
>
> [3/3] Modify the cpufreq core to use the mechanism introduced by [1/3] instead
> of per-CPU deferrable timers to queue up the execution of governor work.
>
> Again, this isn't really straightforward for the above reasons, but still the
> code size is reduced a bit by the changes.
>
> I'm still unsure about the energy consumption and performance impact of [3/3]
> as earlier versions of it led to inconsistent results (most likely due to bugs
> in them that hopefully have been fixed in this version). In particular, the
> additional irq_work may turn out to be problematic, but more optimizations are
> possible on top of this one even if it makes things worse by itself.
>
> For example, it should be possible to move the execution of state selection
> code into the utilization update callback itself, at least in principle, for
> all governors. The P-state/OPP adjustment may need to be run from process
> context still, but for the drivers that can do it without sleeping it should
> be possible to move that into the utilization update callback as well.
>
> The patches are on top of 4.5-rc1 and have been tested on a couple of x86
> machines.
Well, no responses here, so I'm inclined to believe that this series is fine
by everybody (at least by everybody in the CC).
I can wait a few more days, but new material is starting to pile up on top
of these patches and I'll simply need to move forward at some point.
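For anyone who has not looked at the patches yet, the overall shape of the
mechanism can be pictured roughly as below. This is only a userspace sketch
and not code from the series: all of the names (util_hook, register_util_hook,
sched_util_changed, governor_hook, run_deferred_work) and the 10 ms sampling
interval are made up for illustration, and the work_pending flag merely stands
in for queuing an irq_work that would later kick the process-context part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4
#define SAMPLE_INTERVAL_NS 10000000ULL	/* 10 ms, arbitrary for this sketch */

/* Hypothetical per-CPU hook object; names are not taken from the patches. */
struct util_hook {
	void (*func)(struct util_hook *hook, uint64_t now_ns,
		     unsigned long util, unsigned long max);
	uint64_t last_sample_ns;
	unsigned long last_util;
	unsigned long last_max;
	int work_pending;	/* stands in for a queued irq_work */
	unsigned int work_runs;	/* how many times the deferred part ran */
};

static struct util_hook *hooks[NR_CPUS];

/* Governor/driver side: install (or clear, with NULL) the hook for one CPU. */
void register_util_hook(unsigned int cpu, struct util_hook *hook)
{
	assert(cpu < NR_CPUS);
	hooks[cpu] = hook;
}

/* Scheduler side: called whenever the utilization of a CPU changes. */
void sched_util_changed(unsigned int cpu, uint64_t now_ns,
			unsigned long util, unsigned long max)
{
	struct util_hook *hook;

	assert(cpu < NR_CPUS);
	hook = hooks[cpu];
	if (hook && hook->func)
		hook->func(hook, now_ns, util, max);
}

/*
 * Example callback. It runs in the (simulated) scheduler context, so it
 * must stay cheap and must not sleep: it only records a sample and flags
 * the heavier work to be done later, at most once per sampling interval.
 */
void governor_hook(struct util_hook *hook, uint64_t now_ns,
		   unsigned long util, unsigned long max)
{
	if (hook->work_pending ||
	    now_ns - hook->last_sample_ns < SAMPLE_INTERVAL_NS)
		return;

	hook->last_sample_ns = now_ns;
	hook->last_util = util;
	hook->last_max = max;
	hook->work_pending = 1;
}

/* Process-context side: run the deferred work if any; returns 1 if it ran. */
int run_deferred_work(struct util_hook *hook)
{
	if (!hook->work_pending)
		return 0;

	hook->work_pending = 0;
	hook->work_runs++;	/* frequency selection would happen here */
	return 1;
}
```

The split between governor_hook() and run_deferred_work() is meant to mirror
the constraint described in the cover letter: the callback itself cannot wake
up a process directly, so it only leaves a note for code that can.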
Thanks,
Rafael