Re: [PATCH v5 05/12] PM / devfreq: Add support for policy notifiers
From: Chanwoo Choi
Date: Wed Aug 01 2018 - 21:59:09 EST
Hi Matthias,
On 2018-08-02 02:08, Matthias Kaehlcke wrote:
> Hi Chanwoo,
>
> On Wed, Aug 01, 2018 at 10:22:16AM +0900, Chanwoo Choi wrote:
>> On 2018-08-01 04:39, Matthias Kaehlcke wrote:
>>> On Mon, Jul 16, 2018 at 10:50:50AM -0700, Matthias Kaehlcke wrote:
>>>> On Thu, Jul 12, 2018 at 05:44:33PM +0900, Chanwoo Choi wrote:
>>>>> Hi Matthias,
>>>>>
>>>>> On 2018-07-07 02:53, Matthias Kaehlcke wrote:
>>>>>> Hi Chanwoo,
>>>>>>
>>>>>> On Wed, Jul 04, 2018 at 03:41:46PM +0900, Chanwoo Choi wrote:
>>>>>>
>>>>>>> Firstly,
>>>>>>> I'm not sure why devfreq needs the devfreq_verify_within_limits() function.
>>>>>>>
>>>>>>> devfreq already uses the OPP interface by default. This means that
>>>>>>> code outside of 'drivers/devfreq', such as
>>>>>>> drivers/thermal/devfreq_cooling.c, can disable/enable frequencies.
>>>>>>> Also, when a device driver disables/enables a specific frequency,
>>>>>>> the devfreq core takes that into account.
>>>>>>>
>>>>>>> So, devfreq doesn't need devfreq_verify_within_limits() because it
>>>>>>> already supports an interface to change the minimum/maximum frequency
>>>>>>> of a devfreq device.
>>>>>>>
>>>>>>> In the case of the cpufreq subsystem, cpufreq only provides
>>>>>>> 'cpufreq_verify_within_limits()' to change the minimum/maximum frequency
>>>>>>> of a CPU. A device driver cannot change the minimum/maximum frequency
>>>>>>> through the OPP interface.
>>>>>>>
>>>>>>> But in the case of the devfreq subsystem, as I already explained, devfreq
>>>>>>> supports the OPP interface as the default way. The devfreq subsystem
>>>>>>> doesn't need another way to change the minimum/maximum frequency.
>>>>>>
>>>>>> Using the OPP interface exclusively works as long as the
>>>>>> enabling/disabling of OPPs is limited to a single driver
>>>>>> (drivers/thermal/devfreq_cooling.c). When multiple drivers are
>>>>>> involved you need a way to resolve conflicts, that's the purpose of
>>>>>> devfreq_verify_within_limits(). Please let me know if there are
>>>>>> existing mechanisms for conflict resolution that I overlooked.
>>>>>>
>>>>>> Possibly drivers/thermal/devfreq_cooling.c could be migrated to use
>>>>>> devfreq_verify_within_limits() instead of the OPP interface if
>>>>>> desired, however this seems beyond the scope of this series.
>>>>>
>>>>> Actually, if we use this approach, it doesn't support multiple drivers either.
>>>>> If non-throttler drivers use devfreq_verify_within_limits(), a conflict
>>>>> happens.
>>>>
>>>> As long as drivers limit the max freq there is no conflict, the lowest
>>>> max freq wins. I expect this to be the usual case, apparently it
>>>> worked for cpufreq for 10+ years.
>>>>
>>>> However it is correct that there would be a conflict if a driver
>>>> requests a min freq that is higher than the max freq requested by
>>>> another. In this case devfreq_verify_within_limits() resolves the
>>>> conflict by raising p->max to the min freq. Not sure if this is
>>>> something that would ever occur in practice though.
>>>>
>>>> If we are really concerned about this case it would also be an option
>>>> to limit the adjustment to the max frequency.
>>>>
>>>>> To resolve the conflict between multiple device drivers, maybe the OPP
>>>>> interface has to support a 'usage_count' like clk_enable/disable().
>>>>
>>>> This would require supporting negative usage count values, since an OPP
>>>> should not be enabled if e.g. thermal enables it but the throttler
>>>> disabled it or vice versa.
>>>>
>>>> Theoretically there could also be conflicts, like one driver disabling
>>>> the higher OPPs and another the lower ones, with the outcome of all
>>>> OPPs being disabled, which would be a more drastic conflict resolution
>>>> than that of devfreq_verify_within_limits().
>>>>
>>>> Viresh, what do you think about an OPP usage count?
>>>
>>> Ping, can we try to reach a conclusion on this or at least keep the
>>> discussion going?
>>>
>>> Not that it matters, but my preferred solution continues to be
>>> devfreq_verify_within_limits(). It solves conflicts in some way (which
>>> could be adjusted if needed) and has proven to work in practice for
>>> 10+ years in a very similar sub-system.
>>
>> That is not true. The current cpufreq subsystem doesn't support external
>> OPP control to enable/disable an OPP entry. If some device driver
>> controls the OPP entries of a cpufreq driver with opp_disable/enable(),
>> the operation does not work, because cpufreq considers only the limits
>> set through 'cpufreq_verify_within_limits()'.
>
> Ok, we can probably agree that using cpufreq_verify_within_limits()
> exclusively seems to have worked well for cpufreq, and that in their
> overall purpose cpufreq and devfreq are similar subsystems.
>
> The current throttler series with devfreq_verify_within_limits() takes
> the enabled OPPs into account, the lowest and highest OPP are used as
> a starting point for the frequency adjustment and (in theory) the
> frequency range should only be narrowed by
> devfreq_verify_within_limits().
>
>> As I already commented[1], there is a difference between cpufreq and devfreq.
>> [1] https://lkml.org/lkml/2018/7/4/80
>>
>> The subsystem already uses the OPP interface to control
>> specific OPP entries. I don't want to provide two external methods
>> to control the frequency of a devfreq driver. That might cause confusion.
>
> I understand your point, it would indeed be preferable to have a
> single method. However I'm not convinced that the OPP interface is
> a suitable solution, as I explained earlier in this thread (quoted
> below).
>
> I would like you to at least consider the possibility of changing
> drivers/thermal/devfreq_cooling.c to devfreq_verify_within_limits().
> Besides that it's not what is currently used, do you see any technical
> concerns that would make devfreq_verify_within_limits() an unsuitable
> or inferior solution?
As we already discussed, devfreq_verify_within_limits() doesn't support
multiple outside controllers (e.g., devfreq_cooling.c).
With the throttler core you are proposing, there are at least two
outside controllers (e.g., devfreq_cooling.c and the throttler driver).
Knowing about this conflict problem, I cannot agree to a temporary
method. The OPP interface is mandatory for devfreq in order to control
the OPP (frequency/voltage). In this situation, we have to try to
find a method based on the OPP interface.
We can refer to regulator/clock. Multiple device drivers can use
a regulator/clock without any problem. I think that the usage of OPPs
is similar to regulator/clock. As you mentioned, maybe OPP
would have to handle a negative count. Even if opp_enable/opp_disable()
have to handle a negative count in order to support use by
multiple device drivers, I think that
this approach is right.
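As a rough illustration of the usage-count idea, here is a minimal
userspace sketch. It is not the kernel's OPP API: struct opp_entry and
the three helpers are hypothetical names, and the sign convention
(0 = no outstanding requests, negative = disables outstanding, each
disable balanced by one enable from the same driver) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an OPP entry; not the kernel's struct dev_pm_opp. */
struct opp_entry {
	unsigned long freq_hz;
	int usage_count;	/* 0: no requests, < 0: disables still outstanding */
};

/* Assumption: every disable is later undone by one enable from the same driver. */
static void opp_disable(struct opp_entry *opp)
{
	assert(opp);
	opp->usage_count--;
}

static void opp_enable(struct opp_entry *opp)
{
	assert(opp);
	opp->usage_count++;
}

/* The entry is usable only while no disable request is outstanding. */
static bool opp_is_available(const struct opp_entry *opp)
{
	assert(opp);
	return opp->usage_count >= 0;
}
```

With this convention, if the throttler and thermal both disable an
entry and thermal later enables it again, the count is still -1 and the
entry stays unavailable until the throttler also calls opp_enable().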
>
>> I want to use only OPP interface to enable/disable frequency
>> even if we have to modify the OPP interface.
>
> These are the concerns I raised earlier about a solution with OPP
> usage counts:
>
> "This would require supporting negative usage count values, since an OPP
> should not be enabled if e.g. thermal enables it but the throttler
> disabled it or vice versa.
I already replied about the negative usage count. I think that a negative
usage count is not a problem if this approach resolves the issue.
>
> Theoretically there could also be conflicts, like one driver disabling
> the higher OPPs and another the lower ones, with the outcome of all
> OPPs being disabled, which would be a more drastic conflict resolution
> than that of devfreq_verify_within_limits()."
>
> What do you think about these points?
It depends on how multiple device drivers use the OPP interface.
If devfreq/opp provides the control method and outside device drivers
misuse it, that is a problem of the user.
Besides, if we use the OPP interface, we can check through the usage count
why an OPP entry is disabled or enabled.
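To make the conflict case quoted above concrete, the following
userspace sketch compares the two outcomes. It is only an illustration:
struct freq_policy, verify_within_limits() and count_enabled_opps() are
hypothetical names, and the clamping logic is modeled on the behavior
described earlier in this thread (max is raised to the requested min),
not taken from the actual patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical frequency range; units are arbitrary (e.g. MHz). */
struct freq_policy {
	unsigned long min;
	unsigned long max;
};

/* Narrow the policy to [min, max]; on conflict, max is raised to min. */
static void verify_within_limits(struct freq_policy *p,
				 unsigned long min, unsigned long max)
{
	assert(p && min <= max);
	if (p->min < min)
		p->min = min;
	if (p->max < min)
		p->max = min;	/* conflict: requested min above current max */
	if (p->min > max)
		p->min = max;
	if (p->max > max)
		p->max = max;
}

/*
 * Count table entries left enabled when driver A disables every OPP
 * above a_max and driver B disables every OPP below b_min.
 */
static size_t count_enabled_opps(const unsigned long *freqs, size_t n,
				 unsigned long a_max, unsigned long b_min)
{
	size_t i, enabled = 0;

	for (i = 0; i < n; i++) {
		if (freqs[i] > a_max)
			continue;	/* disabled by driver A */
		if (freqs[i] < b_min)
			continue;	/* disabled by driver B */
		enabled++;
	}
	return enabled;
}
```

For a table of 200, 400, 600 and 800 where one driver caps the maximum
at 400 and another then requests a minimum of 600, the range approach
ends at 600..600, while range-based OPP disabling leaves no entry
enabled.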
>
> The negative usage counts aren't necessarily a dealbreaker in a
> technical sense, though I'm not a friend of quirky interfaces that
> don't behave like a typical user would expect (e.g. an OPP isn't
> necessarily enabled after dev_pm_opp_enable()).
>
> I can send an RFC with OPP usage counts, though due to the above
> concerns I have doubts it will be well received.
Please add me to the Cc list.
>
> Thanks
>
> Matthias
>
>
--
Best Regards,
Chanwoo Choi
Samsung Electronics