Re: [PATCH v2] schedutil: Allow cpufreq requests to be made even when kthread kicked

From: Rafael J. Wysocki
Date: Tue May 22 2018 - 12:13:22 EST


On Tue, May 22, 2018 at 5:30 PM, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> On Tue, May 22, 2018 at 12:02 PM, Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>> On Mon, May 21, 2018 at 6:13 PM, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>>> On Mon, May 21, 2018 at 10:29:52AM +0200, Rafael J. Wysocki wrote:
>>>> On Mon, May 21, 2018 at 7:14 AM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>>>> > On 18-05-18, 11:55, Joel Fernandes (Google.) wrote:
>>>> >> From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
>>>> >>
>>>> >> Currently there is a chance of a schedutil cpufreq update request to be
>>>> >> dropped if there is a pending update request. This pending request can
>>>> >> be delayed if there is a scheduling delay of the irq_work and the wake
>>>> >> up of the schedutil governor kthread.
>>>> >>
>>>> >> A very bad scenario is when a schedutil request was already just made,
>>>> >> such as to reduce the CPU frequency, then a newer request to increase
>>>> >> CPU frequency (even sched deadline urgent frequency increase requests)
>>>> >> can be dropped, even though the rate limits suggest that its Ok to
>>>> >> process a request. This is because of the way the work_in_progress flag
>>>> >> is used.
>>>> >>
>>>> >> This patch improves the situation by allowing new requests to happen
>>>> >> even though the old one is still being processed. Note that in this
>>>> >> approach, if an irq_work was already issued, we just update next_freq
>>>> >> and don't bother to queue another request so there's no extra work being
>>>> >> done to make this happen.
>>>> >
>>>> > Now that this isn't an RFC anymore, you shouldn't have added below
>>>> > paragraph here. It could go to the comments section though.
>>>> >
>>>> >> I had brought up this issue at the OSPM conference and Claudio had a
>>>> >> discussion RFC with an alternate approach [1]. I prefer the approach as
>>>> >> done in the patch below since it doesn't need any new flags and doesn't
>>>> >> cause any other extra overhead.
>>>> >>
>>>> >> [1] https://patchwork.kernel.org/patch/10384261/
>>>> >>
>>>> >> LGTMed-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
>>>> >> LGTMed-by: Juri Lelli <juri.lelli@xxxxxxxxxx>
>>>> >
>>>> > Looks like a Tag you just invented ? :)
>>>>
>>>> Yeah.
>>>>
>>>> The LGTM from Juri can be converned into an ACK silently IMO. That
>>>> said I have added Looks-good-to: tags to a couple of commits. :-)
>>>
>>> Cool, I'll covert them to Acks :-)
>>
>> So it looks like I should expect an update of this patch, right?
>>
>> Or do you prefer the current one to be applied and work on top of it?
>>
>
> [cut]
>
>>>
>>> I just realized that on a single policy switch that uses the governor thread,
>>> there will be 1 thread per-CPU. The sugov_update_single will be called on the
>>> same CPU with interrupts disabled.
>>
>> sugov_update_single() doesn't have to run on the target CPU.
>
> Which sadly is a bug IMO. :-/

My bad.

sugov_update_single() runs under rq->lock, so it need not run on a
target CPU so long as the CPU running it can update the frequency for
the target and there is the requisite check for that in
sugov_should_update_freq().

That means that sugov_update_single() will not run concurrently on two
different CPUs for the same target, but it may be running concurrently
with the kthread (as pointed out by Viresh).

>>> In sugov_work, we are doing a
>>> raw_spin_lock_irqsave which also disables interrupts. So I don't think
>>> there's any possibility of a race happening on the same CPU between the
>>> frequency update request and the sugov_work executing. In other words, I feel
>>> we can drop the above if (..) statement for single policies completely and
>>> only keep the changes for the shared policy. Viresh since you brought up the
>>> single policy issue initially which made me add this if statememnt, could you
>>> let me know if you agree with what I just said?
>>
>> Which is why you need the spinlock too.
>
> And you totally have a point.

Not really, sorry about that.

It is necessary to take the spinlock in the non-fast-switch case,
because of the possible race with the kthread, so something like my
patch at https://patchwork.kernel.org/patch/10418551/ is needed after
all.