Re: [PATCH v2 1/3] PM: Introduce DEVFREQ: generic DVFS framework withdevice-specific OPPs
From: MyungJoo Ham
Date: Wed May 18 2011 - 04:22:53 EST
2011/5/18 Rafael J. Wysocki <rjw@xxxxxxx>:
> Hi,
>
> On Wednesday, May 11, 2011, MyungJoo Ham wrote:
[]
>> +/*
>> + * devfreq_work periodically (given by DEVFREQ_INTERVAL) monitors every
>> + * registered device.
>> + */
>
> I'm not sure what that means. ÂDoes it happen only if monitoring is 'true'?
> If so, perhaps it could be called 'polling'?
>
Yes, it is correct. I'll change the name to polling.
>> +static bool monitoring;
[]
>> + Â Â Â Â Â Â Â Â Â Â devfreq = tmp_devfreq;
>
> Well, it looks like you could simply do "return tmp_devfreq" here
> (then you'd only need one local pointer).
>
>> + Â Â return devfreq;
>
> And return ERR_PTR(-ENODEV) here.
Good. I'll do that.
>
>> +}
>> +
>> +#define dev_dbg_once(dev, fmt, ...) Â Â Â Â Â Â Â Â Â Â Â Â Â\
>> + Â Â if (!once) { Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â\
>> + Â Â Â Â Â Â once = 1; Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â \
>> + Â Â Â Â Â Â dev_dbg(dev, pr_fmt(fmt), ##__VA_ARGS__); Â Â Â \
>> + Â Â }
>
> Why do you need this?
>
This devfreq_do is going to be called periodically; thus, I want to
print a message if there is an error, but not too many messages with
the repeated calls.
Besides, I'd change the macro like this:
#define dev_dbg_once(dev, fmt, ...) Â Â Â Â Â Â Â Â Â Â Â Â Â\
{ \
static int once; \
  if (!once) {                      Â\
      once = 1;                    \
      dev_dbg(dev, pr_fmt(fmt), ##__VA_ARGS__);    \
} \
  }
so that "static int once;" in functions can be removed.
>> + Â Â static int once;
>
> No, please. ÂEither make it a part of the macro, or remove the macro
> entirely.
>
>> +
>> + Â Â err = devfreq->governor->get_target_freq(devfreq, &freq);
>> + Â Â if (err) {
>> + Â Â Â Â Â Â dev_dbg_once(devfreq->dev, "%s: get_target_freq error(%d)\n",
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â__func__, err);
>> + Â Â Â Â Â Â return err;
>> + Â Â }
>> +
>> + Â Â opp = opp_find_freq_ceil(devfreq->dev, &freq);
>> + Â Â if (opp == ERR_PTR(-ENODEV))
>> + Â Â Â Â Â Â opp = opp_find_freq_floor(devfreq->dev, &freq);
>> +
>> + Â Â if (IS_ERR(opp)) {
>> + Â Â Â Â Â Â dev_dbg_once(devfreq->dev, "%s: Cannot find opp with %luHz.\n",
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â__func__, freq);
>> + Â Â Â Â Â Â return PTR_ERR(opp);
>> + Â Â }
>> +
>> + Â Â freq = opp_get_freq(opp);
>> + Â Â if (devfreq->previous_freq != freq) {
>> + Â Â Â Â Â Â err = devfreq->profile->target(devfreq->dev, opp);
>> + Â Â Â Â Â Â if (!err)
>> + Â Â Â Â Â Â Â Â Â Â devfreq->previous_freq = freq;
>
> I'd do
>
> Â Âif (devfreq->previous_freq == freq)
> Â Â Â Âreturn 0;
>
> Â Âerr = devfreq->profile->target(devfreq->dev, opp);
> Â Âif (err) {
> Â Â Â print message or something;
> Â Â Â return err;
> Â Â}
>
> Â Âdevfreq->previous_freq = freq;
> Â Âreturn 0;
>
Good. That looks more pretty. I'll take that form.
>> + Â Â }
>> +
>> + Â Â if (err)
>> + Â Â Â Â Â Â dev_dbg_once(devfreq->dev, "%s: Cannot set %luHz/%luuV\n",
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â__func__, opp_get_freq(opp), opp_get_voltage(opp));
>> + Â Â return err;
>> +}
>> +
>> +/**
>> + * devfreq_monitor() - Regularly run devfreq_do() and support device DEVFREQ tickle.
>> + * @work: the work struct used to run devfreq_monitor periodically.
>> + */
>> +static void devfreq_monitor(struct work_struct *work)
>> +{
>> + Â Â struct devfreq *devfreq;
>> + Â Â int error;
>> + Â Â int reserved = 0;
>> + Â Â static int once;
>> +
>> + Â Â mutex_lock(&devfreq_list_lock);
>> +
>> + Â Â list_for_each_entry(devfreq, &devfreq_list, node) {
>> + Â Â Â Â Â Â if (devfreq->next_polling == 0)
>> + Â Â Â Â Â Â Â Â Â Â continue;
>> +
>> + Â Â Â Â Â Â reserved++;
>
> Why is the variable called 'reserved'?
The variable reserves the next devfreq_monitor call. If there is no
one reserves, we stop monitoring.
However, I've found a bug in this routine (not affecting "reserved++"
directly though) that a tickled device without polling governor may
not return to its original frequency. It is easily addressed by adding
one more condition to the if statement before reserved++;. I'll fix
that in the next revision.
>
>> +
>> + Â Â Â Â Â Â if (devfreq->tickle) {
>> + Â Â Â Â Â Â Â Â Â Â devfreq->tickle--;
>> + Â Â Â Â Â Â Â Â Â Â continue;
>> + Â Â Â Â Â Â }
>
> I'd put a coment above that explaining what's going on here.
Ok.
>
>> + Â Â Â Â Â Â if (devfreq->next_polling == 1) {
>> + Â Â Â Â Â Â Â Â Â Â error = devfreq_do(devfreq);
>> + Â Â Â Â Â Â Â Â Â Â if (error && !once) {
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â once = 1;
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â dev_err(devfreq->dev, "devfreq_do error(%d)\n",
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â error);
>
> Hmm. ÂI'm not sure, but wouldn't it make sense to drop the device from
> the list if devfreq_do() returns error code?
Umm... yeah.. that option (calling devfreq_remove_device() for errors)
is also possible, which will also remove the need for the macro you've
mentioned.
However, when the error is temporary or the device has blocked
changing frequencies temporarily from target callback or governor, it
could be not so desirable.
So, I'm considering to call devfreq_remove_device() at error if the
error is not "EAGAIN". That will also remove the need for the macro
and debug messages above. How about that?
[]
>> +int devfreq_update(struct device *dev, bool may_not_exist)
>> +{
>> + Â Â struct devfreq *devfreq;
>> + Â Â int err = 0;
>> +
>> + Â Â mutex_lock(&devfreq_list_lock);
>> +
>> + Â Â devfreq = find_device_devfreq(dev);
>> + Â Â if (IS_ERR(devfreq)) {
>> + Â Â Â Â Â Â if (may_not_exist && PTR_ERR(devfreq) == -EINVAL)
>> + Â Â Â Â Â Â Â Â Â Â goto out;
>> +
>
> This is kind of strange. ÂWhy don't you simply pass -ENODEV to the caller
> so that it can decide?
Ewww.. Indeed. I'll remove that lines.
>
>> + Â Â Â Â Â Â err = PTR_ERR(devfreq);
>> + Â Â Â Â Â Â goto out;
>> + Â Â }
>> +
>> + Â Â if (devfreq->tickle) {
>> + Â Â Â Â Â Â unsigned long freq = devfreq->profile->max_freq;
>> + Â Â Â Â Â Â struct opp *opp = opp_find_freq_floor(devfreq->dev, &freq);
>> +
>> + Â Â Â Â Â Â if (!IS_ERR(opp) && devfreq->previous_freq != freq) {
>> + Â Â Â Â Â Â Â Â Â Â err = devfreq->profile->target(devfreq->dev, opp);
>> + Â Â Â Â Â Â Â Â Â Â if (!err)
>> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â devfreq->previous_freq = opp_get_freq(opp);
>> + Â Â Â Â Â Â }
>
> This looks like a code duplication from devfreq_do(). ÂWould it make sense
> to put it into a separate function?
devfreq_do() does not handle tickling. In devfreq_monitor(), it also
calls devfreq_do only when it is not being tickled.
>
>> + Â Â } else {
>> + Â Â Â Â Â Â err = devfreq_do(devfreq);
>> + Â Â }
>> +
[]
>> +int devfreq_tickle_device(struct device *dev, unsigned long duration_ms)
>> +{
>> + Â Â struct devfreq *devfreq;
>> + Â Â struct opp *opp;
>> + Â Â unsigned long freq;
>> + Â Â int err = 0;
>> +
>> + Â Â mutex_lock(&devfreq_list_lock);
>> + Â Â devfreq = find_device_devfreq(dev);
>
> I think we should return here if the device hasn't been found.
> Do you want to set 'monitoring' unconditionally and schedule the work item
> unconditionally in this function and if so, why not to do that at the
> beginning?
Uhhh.. yes, this part required some cleaning. In fact, the cleaning is
done with the "add sysfs interface" patch, which added sysfs interface
to tickle. I'll restructure the patches and move the cleaned part from
the "add sysfs interface" patch.
[]
>> diff --git a/kernel/power/Kconfig b/kernel/power/Kconfig
>> index 4603f08..e5d2e36 100644
>> --- a/kernel/power/Kconfig
>> +++ b/kernel/power/Kconfig
>> @@ -225,3 +225,28 @@ config PM_OPP
>> Â Â Â Â representing individual voltage domains and provides SOC
>> Â Â Â Â implementations a ready to use framework to manage OPPs.
>> Â Â Â Â For more information, read <file:Documentation/power/opp.txt>
>> +
>> +config PM_DEVFREQ
>> + Â Â bool "Generic Dynamic Voltage and Frequency Scaling (DVFS) Framework"
>> + Â Â depends on PM_OPP
>
> This assumes the user will know if his/her platform uses that code.
> It may be a good idea to make it depend on a user-invisible option that
> can be selected by the platform.
I think that like CPUFREQ, users will want to enable and disable
DEVFREQ feature by choice although they cannot choose the governor
directly. I'm also considering to allow users to set governors
forcibly and globally at menuconfig (like CPUFREQ does). With CPUFREQ,
such options helped a lot in troubleshooting of CPU related issues.
Do you think it'd be better to have DEVFREQ enabled unconditionally
(if PM_OPP is available) nonetheless?
>>
>
> Thanks,
> Rafael
>
Cheers!
- MyungJoo
--
MyungJoo Ham (íëì), Ph.D.
Mobile Software Platform Lab,
Digital Media and Communications (DMC) Business
Samsung Electronics
cell: 82-10-6714-2858
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/