Re: [PATCH v2] schedutil: Allow cpufreq requests to be made even when kthread kicked

From: Joel Fernandes
Date: Mon May 21 2018 - 11:19:06 EST


On Mon, May 21, 2018 at 10:29:52AM +0200, Rafael J. Wysocki wrote:
> On Mon, May 21, 2018 at 7:14 AM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> > On 18-05-18, 11:55, Joel Fernandes (Google.) wrote:
> >> From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
> >>
> >> Currently there is a chance of a schedutil cpufreq update request to be
> >> dropped if there is a pending update request. This pending request can
> >> be delayed if there is a scheduling delay of the irq_work and the wake
> >> up of the schedutil governor kthread.
> >>
> >> A very bad scenario is when a schedutil request was already just made,
> >> such as to reduce the CPU frequency, then a newer request to increase
> >> CPU frequency (even sched deadline urgent frequency increase requests)
> >> can be dropped, even though the rate limits suggest that its Ok to
> >> process a request. This is because of the way the work_in_progress flag
> >> is used.
> >>
> >> This patch improves the situation by allowing new requests to happen
> >> even though the old one is still being processed. Note that in this
> >> approach, if an irq_work was already issued, we just update next_freq
> >> and don't bother to queue another request so there's no extra work being
> >> done to make this happen.
> >
> > Now that this isn't an RFC anymore, you shouldn't have added below
> > paragraph here. It could go to the comments section though.
> >
> >> I had brought up this issue at the OSPM conference and Claudio had a
> >> discussion RFC with an alternate approach [1]. I prefer the approach as
> >> done in the patch below since it doesn't need any new flags and doesn't
> >> cause any other extra overhead.
> >>
> >> [1] https://patchwork.kernel.org/patch/10384261/
> >>
> >> LGTMed-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
> >> LGTMed-by: Juri Lelli <juri.lelli@xxxxxxxxxx>
> >
> > Looks like a Tag you just invented ? :)
>
> Yeah.
>
> The LGTM from Juri can be converned into an ACK silently IMO. That
> said I have added Looks-good-to: tags to a couple of commits. :-)

Cool, I'll covert them to Acks :-)

[..]
> >> v1 -> v2: Minor style related changes.
> >>
> >> kernel/sched/cpufreq_schedutil.c | 34 ++++++++++++++++++++++++--------
> >> 1 file changed, 26 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> >> index e13df951aca7..5c482ec38610 100644
> >> --- a/kernel/sched/cpufreq_schedutil.c
> >> +++ b/kernel/sched/cpufreq_schedutil.c
> >> @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> >> !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> >> return false;
> >>
> >> - if (sg_policy->work_in_progress)
> >> - return false;
> >> -
> >> if (unlikely(sg_policy->need_freq_update)) {
> >> sg_policy->need_freq_update = false;
> >> /*
> >> @@ -128,7 +125,7 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> >>
> >> policy->cur = next_freq;
> >> trace_cpu_frequency(next_freq, smp_processor_id());
> >> - } else {
> >> + } else if (!sg_policy->work_in_progress) {
> >> sg_policy->work_in_progress = true;
> >> irq_work_queue(&sg_policy->irq_work);
> >> }
> >> @@ -291,6 +288,13 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
> >>
> >> ignore_dl_rate_limit(sg_cpu, sg_policy);
> >>
> >> + /*
> >> + * For slow-switch systems, single policy requests can't run at the
> >> + * moment if update is in progress, unless we acquire update_lock.
> >> + */
> >> + if (sg_policy->work_in_progress)
> >> + return;
> >> +
> >
> > I would still want this to go away :)
> >
> > @Rafael, will it be fine to get locking in place for unshared policy
> > platforms ?
>
> As long as it doesn't affect the fast switch path in any way.

I just realized that on a single policy switch that uses the governor thread,
there will be 1 thread per-CPU. The sugov_update_single will be called on the
same CPU with interrupts disabled. In sugov_work, we are doing a
raw_spin_lock_irqsave which also disables interrupts. So I don't think
there's any possibility of a race happening on the same CPU between the
frequency update request and the sugov_work executing. In other words, I feel
we can drop the above if (..) statement for single policies completely and
only keep the changes for the shared policy. Viresh since you brought up the
single policy issue initially which made me add this if statememnt, could you
let me know if you agree with what I just said?

> >> if (!sugov_should_update_freq(sg_policy, time))
> >> return;
> >>
> >> @@ -382,13 +386,27 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
> >> static void sugov_work(struct kthread_work *work)
> >> {
> >> struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> >> + unsigned int freq;
> >> + unsigned long flags;
> >> +
> >> + /*
> >> + * Hold sg_policy->update_lock shortly to handle the case where:
> >> + * incase sg_policy->next_freq is read here, and then updated by
> >> + * sugov_update_shared just before work_in_progress is set to false
> >> + * here, we may miss queueing the new update.
> >> + *
> >> + * Note: If a work was queued after the update_lock is released,
> >> + * sugov_work will just be called again by kthread_work code; and the
> >> + * request will be proceed before the sugov thread sleeps.
> >> + */
> >> + raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
> >> + freq = sg_policy->next_freq;
> >> + sg_policy->work_in_progress = false;
> >> + raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);
> >>
> >> mutex_lock(&sg_policy->work_lock);
> >> - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> >> - CPUFREQ_RELATION_L);
> >> + __cpufreq_driver_target(sg_policy->policy, freq, CPUFREQ_RELATION_L);
> >> mutex_unlock(&sg_policy->work_lock);
> >> -
> >> - sg_policy->work_in_progress = false;
> >> }
> >>
> >> static void sugov_irq_work(struct irq_work *irq_work)
> >
> > Fix the commit log and you can add my
>
> I can fix it up.
>
> > Acked-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>

Thanks!

- Joel