Re: [PATCH v2] schedutil: Allow cpufreq requests to be made even when kthread kicked

From: Rafael J. Wysocki
Date: Mon May 21 2018 - 03:35:03 EST


On Mon, May 21, 2018 at 7:14 AM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> On 18-05-18, 11:55, Joel Fernandes (Google.) wrote:
>> From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
>>
>> Currently there is a chance of a schedutil cpufreq update request to be
>> dropped if there is a pending update request. This pending request can
>> be delayed if there is a scheduling delay of the irq_work and the wake
>> up of the schedutil governor kthread.
>>
>> A very bad scenario is when a schedutil request was already just made,
>> such as to reduce the CPU frequency, then a newer request to increase
>> CPU frequency (even sched deadline urgent frequency increase requests)
>> can be dropped, even though the rate limits suggest that its Ok to
>> process a request. This is because of the way the work_in_progress flag
>> is used.
>>
>> This patch improves the situation by allowing new requests to happen
>> even though the old one is still being processed. Note that in this
>> approach, if an irq_work was already issued, we just update next_freq
>> and don't bother to queue another request so there's no extra work being
>> done to make this happen.
>
> Now that this isn't an RFC anymore, you shouldn't have added below
> paragraph here. It could go to the comments section though.
>
>> I had brought up this issue at the OSPM conference and Claudio had a
>> discussion RFC with an alternate approach [1]. I prefer the approach as
>> done in the patch below since it doesn't need any new flags and doesn't
>> cause any other extra overhead.
>>
>> [1] https://patchwork.kernel.org/patch/10384261/
>>
>> LGTMed-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
>> LGTMed-by: Juri Lelli <juri.lelli@xxxxxxxxxx>
>
> Looks like a Tag you just invented ? :)

Yeah.

The LGTM from Juri can be converned into an ACK silently IMO. That
said I have added Looks-good-to: tags to a couple of commits. :-)

>> CC: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
>> CC: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>> CC: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> CC: Ingo Molnar <mingo@xxxxxxxxxx>
>> CC: Patrick Bellasi <patrick.bellasi@xxxxxxx>
>> CC: Juri Lelli <juri.lelli@xxxxxxxxxx>
>> Cc: Luca Abeni <luca.abeni@xxxxxxxxxxxxxxx>
>> CC: Joel Fernandes <joelaf@xxxxxxxxxx>
>> CC: Todd Kjos <tkjos@xxxxxxxxxx>
>> CC: claudio@xxxxxxxxxxxxxxx
>> CC: kernel-team@xxxxxxxxxxx
>> CC: linux-pm@xxxxxxxxxxxxxxx
>> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
>> ---
>> v1 -> v2: Minor style related changes.
>>
>> kernel/sched/cpufreq_schedutil.c | 34 ++++++++++++++++++++++++--------
>> 1 file changed, 26 insertions(+), 8 deletions(-)
>>
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index e13df951aca7..5c482ec38610 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> !cpufreq_can_do_remote_dvfs(sg_policy->policy))
>> return false;
>>
>> - if (sg_policy->work_in_progress)
>> - return false;
>> -
>> if (unlikely(sg_policy->need_freq_update)) {
>> sg_policy->need_freq_update = false;
>> /*
>> @@ -128,7 +125,7 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
>>
>> policy->cur = next_freq;
>> trace_cpu_frequency(next_freq, smp_processor_id());
>> - } else {
>> + } else if (!sg_policy->work_in_progress) {
>> sg_policy->work_in_progress = true;
>> irq_work_queue(&sg_policy->irq_work);
>> }
>> @@ -291,6 +288,13 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>>
>> ignore_dl_rate_limit(sg_cpu, sg_policy);
>>
>> + /*
>> + * For slow-switch systems, single policy requests can't run at the
>> + * moment if update is in progress, unless we acquire update_lock.
>> + */
>> + if (sg_policy->work_in_progress)
>> + return;
>> +
>
> I would still want this to go away :)
>
> @Rafael, will it be fine to get locking in place for unshared policy
> platforms ?

As long as it doesn't affect the fast switch path in any way.

>
>> if (!sugov_should_update_freq(sg_policy, time))
>> return;
>>
>> @@ -382,13 +386,27 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
>> static void sugov_work(struct kthread_work *work)
>> {
>> struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
>> + unsigned int freq;
>> + unsigned long flags;
>> +
>> + /*
>> + * Hold sg_policy->update_lock shortly to handle the case where:
>> + * incase sg_policy->next_freq is read here, and then updated by
>> + * sugov_update_shared just before work_in_progress is set to false
>> + * here, we may miss queueing the new update.
>> + *
>> + * Note: If a work was queued after the update_lock is released,
>> + * sugov_work will just be called again by kthread_work code; and the
>> + * request will be proceed before the sugov thread sleeps.
>> + */
>> + raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
>> + freq = sg_policy->next_freq;
>> + sg_policy->work_in_progress = false;
>> + raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);
>>
>> mutex_lock(&sg_policy->work_lock);
>> - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
>> - CPUFREQ_RELATION_L);
>> + __cpufreq_driver_target(sg_policy->policy, freq, CPUFREQ_RELATION_L);
>> mutex_unlock(&sg_policy->work_lock);
>> -
>> - sg_policy->work_in_progress = false;
>> }
>>
>> static void sugov_irq_work(struct irq_work *irq_work)
>
> Fix the commit log and you can add my

I can fix it up.

> Acked-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>

Thanks,
Rafael