Re: [PATCH 3/5] sched: cpufreq: call cpufreq hook from remote CPUs

From: Steve Muckle
Date: Thu May 19 2016 - 15:19:25 EST


On Thu, May 19, 2016 at 02:00:54PM +0200, Rafael J. Wysocki wrote:
> On Thu, May 19, 2016 at 1:33 AM, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > On Mon, May 9, 2016 at 11:20 PM, Steve Muckle <steve.muckle@xxxxxxxxxx> wrote:
> >> Without calling the cpufreq hook for a remote wakeup it is possible
> >> for such a wakeup to go unnoticed by cpufreq on the target CPU for up
> >> to a full tick. This can occur if the target CPU is running a
> >> CPU-bound task and preemption does not occur. If preemption does occur
> >> then the scheduler is expected to run soon on the target CPU anyway so
> >> invoking the cpufreq hook on the remote wakeup is not required.
> >>
> >> In order to avoid unnecessary interprocessor communication in the
> >> governor for the preemption case, the existing hook (which happens
> >> before preemption is decided) is only called when the target CPU
> >> is within the current CPU's cpufreq policy. A new hook is added in
> >> check_preempt_curr() to handle events outside the current CPU's
> >> cpufreq policy where preemption does not happen.
> >>
> >> Some governors may opt to not receive remote CPU callbacks. This
> >> behavior is possible by providing NULL as the new policy_cpus
> >> parameter to cpufreq_add_update_util_hook(). Callbacks will only be
> >> issued in this case when the target CPU and the current CPU are the
> >> same. Otherwise policy_cpus is used to determine what is a local
> >> vs. remote callback.
> >
> > I don't really like the extra complexity added by this patch.
> >
> > It makes the code look fragile at some places

Perhaps I can improve these, can you point them out?

> > and it only really matters for schedutil

True. As we are trying to create an integrated scheduler+CPU frequency
control solution I think some of this is to be expected, and should be
worthwhile since this is hopefully the future and will eventually
replace the need for the other governors.

> > and for the fast switch case in there.

Once there is a high priority context for the slow path I expect it to
benefit from this as well.

> >
> > Overall, it looks like a premature optimization to me.

Are you referring to this new approach of avoiding duplicate IPIs, or
supporting updates on remote wakeups overall?

The duplicate IPI thing is admittedly not required for the problem I'm
looking to solve but I figured at least some people would be concerned
about it. If folks can live with it for now then I can go back to the
simpler approach I had in my first posting.

> In particular, the dance with checking the policy CPUs from the
> scheduler is questionable (the scheduler might be interested in this
> information for other purposes too and hooking it up in an ad-hoc way
> just for cpufreq doesn't seem to be appropriate from that perspective)
> and also doesn't seem to be necessary.
>
> You can check if the current CPU is a policy one from the governor and
> if that is the case, simply run the frequency update on it directly
> without any IPI (because if both the target CPU and the current CPU
> belong to the same policy, it doesn't matter which one of them will
> run the update).

The checking of policy CPUs from the scheduler was done so as to
minimize the number of calls to the hook, given their expense.

In the case of a remote update the hook has to run (or not) after it is
known whether preemption will occur so we don't do needless work or
IPIs. If the policy CPUs aren't known in the scheduler then the early
hook will always need to be called along with an indication that it is
the early hook being called. If it turns out to be a remote update it
could then be deferred to the later hook, which would only be called
when a remote update has been deferred and preemption has not occurred.

This means two hook inovcations for a remote non-preempting wakeup
though instead of one. Perhaps this is a good middle ground on code
churn vs. optimization though.

thanks,
Steve