Re: [RFC RESEND 0/3] Introduce cpufreq minimum load QoS

From: Valentin Schneider
Date: Wed May 27 2020 - 11:02:47 EST



On 27/05/20 14:11, Benjamin GAIGNARD wrote:
> On 5/27/20 2:14 PM, Valentin Schneider wrote:
>> On 27/05/20 12:17, Benjamin GAIGNARD wrote:
>>> On 5/27/20 12:09 PM, Valentin Schneider wrote:
>>>> Hi Benjamin,
>>>>
>>>> On 26/05/20 16:16, Benjamin Gaignard wrote:
>>>>> A first round [1] of discussions and suggestions has already taken place on
>>>>> this series, but without finding a solution to the problem. I am resending it
>>>>> to make progress on this topic.
>>>>>
>>>> Apologies for sleeping on that previous thread.
>>>>
>>>> So what had been suggested over there was to use uclamp to boost the
>>>> frequency of the handling thread; however if you use threaded IRQs you
>>>> get RT threads, which already get the max frequency by default (at least
>>>> with schedutil).
>>>>
>>>> Does that not work for you, and if so, why?
>>> That doesn't work because almost everything is done by the hardware blocks
>>> without loading the CPU, so the thread isn't running.
>> I'm not sure I follow; the frequency of the CPU doesn't matter while
>> your hardware blocks are spinning, right? AIUI what matters is running
>> your interrupt handler / action at max freq, which you get if you use
>> threaded IRQs and schedutil.
> Yes but not limited to schedutil.
> Given the latency needed to change frequencies, I think it could already be
> too late to change the CPU frequency when handling the threaded interrupt.

Right, on my Juno the transition latency (i.e. worst case) is about
1.2ms; I can see that eating into your time budget, depending on the
framerate you're going for.
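
To put rough numbers on that, here is a minimal sketch in C comparing a
worst-case transition latency against the frame period. The 1.2ms figure is
the Juno number above; the helper names, the frame rates one might feed in,
and the idea of a separate "budget" value are illustrative assumptions, not
anything taken from the series or from the DCMI driver:

```c
#include <stdint.h>

/* Frame period in microseconds for a given frame rate; fps must be non-zero. */
static inline uint32_t frame_period_us(uint32_t fps)
{
	return 1000000u / fps;
}

/*
 * Share of one frame period consumed by a frequency transition, in
 * per mille (tenths of a percent). Integer division truncates.
 */
static inline uint32_t latency_share_permille(uint32_t latency_us, uint32_t fps)
{
	return (latency_us * 1000u) / frame_period_us(fps);
}

/*
 * Non-zero if the transition completes within the given time budget.
 * budget_us is a hypothetical figure (e.g. the blanking time), which
 * depends on the sensor configuration and is not stated in this thread.
 */
static inline int transition_fits(uint32_t latency_us, uint32_t budget_us)
{
	return latency_us < budget_us;
}
```

With a 1200us latency, latency_share_permille() gives 36 at 30 fps and 72 at
60 fps, i.e. roughly 3.6% and 7.2% of the whole frame period. The blanking
time, which is the real budget, is only a fraction of that period.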

Vincent's got a point: if you can limit that max-freq-hold to a single
frequency domain, that would probably be a tad better.

Thanks for persisting through my questioning :-)

>>
>> I think it would help if you could clarify which tasks / parts of your
>> pipeline you need running at high frequencies. The point is that setting
>> a QoS request affects all tasks, whereas we could be smarter and only
>> boost the required tasks.
> What makes us drop frames is that the threaded IRQ is scheduled too late.
> The non-threaded part of the interrupt handler, where we clear the interrupt
> flags, runs fine, but the threaded part does not.
>>
>>> I have done the tests with the schedutil and ondemand governors (ondemand
>>> is the one I'm targeting). I have no issues when using the performance
>>> governor because it always keeps the highest frequency.
>>>
>>>
>>>>> When streaming starts from the sensor, the CPU load can remain very low
>>>>> because almost all of the capture pipeline is done in hardware (i.e. without
>>>>> using the CPU), which leads the cpufreq governor to believe that it can use
>>>>> lower frequencies. If the governor picks too low a frequency, that
>>>>> becomes a problem when we need to acknowledge the interrupt during the
>>>>> blanking time.
>>>>> The delay to ack the interrupt and perform all the other actions before
>>>>> the next frame is very short and doesn't allow the cpufreq governor to
>>>>> provide the required burst of power. That leads to dropping half of the frames.
>>>>>
>>>>> To avoid this problem, the DCMI driver informs the cpufreq governors by
>>>>> adding a cpufreq minimum load QoS request.
>>>>>
>>>>> Benjamin
>>>>>
>>>>> [1] https://lkml.org/lkml/2020/4/24/360
>>>>>
>>>>> Benjamin Gaignard (3):
>>>>>   PM: QoS: Introduce cpufreq minimum load QoS
>>>>>   cpufreq: governor: Use minimum load QoS
>>>>>   media: stm32-dcmi: Inform cpufreq governors about cpu load needs
>>>>>
>>>>>  drivers/cpufreq/cpufreq_governor.c        |   5 +
>>>>>  drivers/media/platform/stm32/stm32-dcmi.c |   8 ++
>>>>>  include/linux/pm_qos.h                    |  12 ++
>>>>>  kernel/power/qos.c                        | 213 ++++++++++++++++++++++++++++++
>>>>>  4 files changed, 238 insertions(+)