Re: broken devfreq simple_ondemand for Odroid XU3/4?

From: Lukasz Luba
Date: Thu Jun 25 2020 - 06:02:17 EST


Hi Sylwester,

On 6/24/20 4:11 PM, Sylwester Nawrocki wrote:
> Hi All,
>
> On 24.06.2020 12:32, Lukasz Luba wrote:
>> I had issues with devfreq governor which wasn't called by devfreq
>> workqueue. The old DELAYED vs DEFERRED work discussions and my patches
>> for it [1]. If the CPU which scheduled the next work went idle, the
>> devfreq workqueue will not be kicked and devfreq governor won't check
>> DMC status and will not decide to decrease the frequency based on low
>> busy_time.
>> The same applies for going up with the frequency. They both are
>> done by the governor but the workqueue must be scheduled periodically.
>
> As I have been working on resolving the video mixer IOMMU fault issue
> described here: https://patchwork.kernel.org/patch/10861757
> I did some investigation of the devfreq operation, mostly on Odroid U3.
>
> My conclusions are similar to what Lukasz says above. I would like to add
> that broken scheduling of the performance counter reads and the devfreq
> updates seems to have one more serious implication. In each call, which
> normally should happen periodically with a fixed interval, we stop the
> counters, read the counter values and start the counters again. But if
> the period between calls becomes long enough to let any of the counters
> overflow, we will get wrong performance measurement results. My
> observations are that the workqueue job can be suspended for several
> seconds and the conditions for the counter overflow occur sooner or
> later, depending, among other things, on the CPU load.
> Wrong bus load measurement can lead to setting too low an interconnect
> bus clock frequency and then bad things happen in peripheral devices.
>
> I agree the workqueue issue needs to be fixed. I have some WIP code to use
> the performance counter overflow interrupts instead of SW polling and with
> that the interconnect bus clock control seems to work much better.
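
To illustrate the overflow problem described above, here is a minimal
sketch of the arithmetic (Python, user space, not driver code). The
32-bit counter width, the 400MHz clock, the wrap-around behaviour and
all names are assumptions made for the example, not values taken from
the hardware or the driver.

```python
COUNTER_BITS = 32  # assumption: counter width; real hardware may differ


def raw_counter(true_count, bits=COUNTER_BITS):
    """Value left in a wrapping hardware counter after true_count events."""
    return true_count % (1 << bits)


def measured_load(clock_hz, interval_s, true_load_percent):
    """Load (0.0..1.0) computed from wrapped busy and cycle counters."""
    total_cycles = clock_hz * interval_s
    busy_cycles = total_cycles * true_load_percent // 100
    total = raw_counter(total_cycles)
    busy = raw_counter(busy_cycles)
    return busy / total if total else 0.0


CLOCK_HZ = 400_000_000  # assumed bus clock, for illustration only

# Polled on time (1 s): nothing wraps, the result matches the real load.
on_time = measured_load(CLOCK_HZ, 1, 90)

# Work delayed to 12 s: both counters wrap past 2**32, so the load looks tiny.
delayed = measured_load(CLOCK_HZ, 12, 90)
```

Under these assumptions a device that is really 90% busy is reported as
roughly 5% busy once the poll is delayed to 12s, which is the kind of
reading that would push the governor to a too low frequency.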


Thank you for sharing your use case and investigation results. I think
we now have enough interested developers to maybe address this
issue: 'workqueue issue needs to be fixed'.
I have hit this devfreq workqueue issue ~5 times on different
platforms.

Regarding the 'performance counters overflow interrupts', there is one
thing worth keeping in mind: variable utilization and frequency.
For example, for the algorithm to decide whether the device should
increase or decrease the frequency, we fix the period of observation,
e.g. to 500ms. That can cause a long delay if the utilization of the
device suddenly drops. For example, we set the overflow threshold to
some value, e.g. 1000, and we know that at 1000MHz and full utilization
(100%) the counter will reach that threshold after 500ms (which we
want, because we don't want too many interrupts per sec). What if the
utilization suddenly drops to 2% (e.g. from 5GB/s to 100MB/s; what if
it drops to 25MB/s?!)? The counter will then reach the threshold after
50*500ms = 25s. The counters alone cannot predict the next utilization
and adjust the threshold.
To address that, we still need another mechanism (like a watchdog)
which is triggered just to check whether the threshold needs adjustment.
This mechanism can be a local timer in the driver or a framework
timer running a kind of 'for loop' over all devices of this type (like
the scheduled workqueue). In both cases there will be interrupts,
timers (even with workqueues) and scheduling in the system.
Forcing developers to implement their own local watchdog timers (or
workqueues) in drivers is IMHO the wrong approach, and that's why we
have frameworks.
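
A rough sketch of what such a framework-level 'for loop' could do on
each tick (Python model only, not kernel code). The Device class, the
watchdog_tick() name and the policy of shrinking the threshold to the
number of events seen in the last period are all assumptions made for
illustration, not an existing devfreq interface.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    threshold: int    # overflow-interrupt threshold, in counter events
    counted: int = 0  # events seen since the counter was last restarted


def watchdog_tick(devices, min_threshold=1):
    """One pass of a shared framework timer over all registered devices."""
    for dev in devices:
        if dev.counted < dev.threshold:
            # No overflow interrupt within this period, so the rate dropped.
            # Shrink the threshold to what was observed, so that the next
            # interrupt arrives after about one period at the new rate.
            dev.threshold = max(min_threshold, dev.counted)
        dev.counted = 0  # restart the measurement window


# Hypothetical devices: "dmc" dropped to 2% of its threshold, "bus" is busy.
dmc = Device("dmc", threshold=1000, counted=20)
bus = Device("bus", threshold=1000, counted=1000)
watchdog_tick([dmc, bus])
```

After the tick, the 'dmc' threshold is 20 and the 'bus' threshold is
still 1000. The point is only that one periodic pass can serve every
device, instead of each driver carrying its own timer.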

Regards,
Lukasz