Re: [PATCHv4 1/8] devfreq: event: Add new devfreq_event class to provide basic data for devfreq governor

From: MyungJoo Ham
Date: Thu Dec 18 2014 - 01:24:35 EST


Hi Chanwoo,

I love the idea and I now have a little mechanical issues in your code.

> ---
> drivers/devfreq/Kconfig | 2 +
> drivers/devfreq/Makefile | 5 +-
> drivers/devfreq/devfreq-event.c | 449 ++++++++++++++++++++++++++++++++++++++++
> drivers/devfreq/event/Makefile | 1 +
> include/linux/devfreq.h | 160 ++++++++++++++
> 5 files changed, 616 insertions(+), 1 deletion(-)
> create mode 100644 drivers/devfreq/devfreq-event.c
> create mode 100644 drivers/devfreq/event/Makefile
>
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index faf4e70..4d15b62 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -87,4 +87,6 @@ config ARM_EXYNOS5_BUS_DEVFREQ
> It reads PPMU counters of memory controllers and adjusts the
> operating frequencies and voltages with OPP support.
>
> +comment "DEVFREQ Event Drivers"
> +
> endif # PM_DEVFREQ
> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
> index 16138c9..a1ffabe 100644
> --- a/drivers/devfreq/Makefile
> +++ b/drivers/devfreq/Makefile
> @@ -1,4 +1,4 @@
> -obj-$(CONFIG_PM_DEVFREQ) += devfreq.o
> +obj-$(CONFIG_PM_DEVFREQ) += devfreq.o devfreq-event.o
> obj-$(CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND) += governor_simpleondemand.o
> obj-$(CONFIG_DEVFREQ_GOV_PERFORMANCE) += governor_performance.o
> obj-$(CONFIG_DEVFREQ_GOV_POWERSAVE) += governor_powersave.o
> @@ -7,3 +7,6 @@ obj-$(CONFIG_DEVFREQ_GOV_USERSPACE) += governor_userspace.o
> # DEVFREQ Drivers
> obj-$(CONFIG_ARM_EXYNOS4_BUS_DEVFREQ) += exynos/
> obj-$(CONFIG_ARM_EXYNOS5_BUS_DEVFREQ) += exynos/
> +
> +# DEVFREQ Event Drivers
> +obj-$(CONFIG_PM_DEVFREQ) += event/
>

It looks getting mature fast.
However, I would like to suggest you to

allow not to compile devfreq-event.c and not include its compiled object
if devfreq.c is required but devfreq-event.c is not required.
(e.g., add CONFIG_PM_DEVFREQ_EVENT and let it be enabled when needed)
just a little concern for lightweight devices.
(this change might require a bit more work on the header as well)
- Or do you think devfreq-event.c will become almost mandatory for
most devfreq drivers?


[snip]

> diff --git a/drivers/devfreq/devfreq-event.c b/drivers/devfreq/devfreq-event.c
> new file mode 100644
> index 0000000..0e1948e
> --- /dev/null
> +++ b/drivers/devfreq/devfreq-event.c
> @@ -0,0 +1,449 @@
> +/*
> + * devfreq-event: Generic DEVFREQ Event class driver

DEVFREQ is a generic DVFS mechanism (or subsystem).

Plus, I thought devfreq-event is considered to be a "framework"
for devfreq event class drivers. Am I mistaken?

[snip]

> +struct devfreq_event_dev *devfreq_event_add_edev(struct device *dev,
> + struct devfreq_event_desc *desc)
> +{
> + struct devfreq_event_dev *edev;
> + static atomic_t event_no = ATOMIC_INIT(0);
> + int ret;
> +
> + if (!dev || !desc)
> + return ERR_PTR(-EINVAL);
> +
> + if (!desc->name || !desc->ops)
> + return ERR_PTR(-EINVAL);
> +
> + if (!desc->ops->set_event || !desc->ops->get_event)
> + return ERR_PTR(-EINVAL);
> +
> + edev = devm_kzalloc(dev, sizeof(*edev), GFP_KERNEL);
> + if (!edev)
> + return ERR_PTR(-ENOMEM);
> +
> + mutex_lock(&devfreq_event_list_lock);

You seem to lock that global lock too long. That lock is only required
while you operate the list. The data to be protected by this mutex is
devfreq_event_list. Until the new entry is added to the list, the new
entry is free from protection. (may be delayed right before list_add)

> + mutex_init(&edev->lock);
> + edev->desc = desc;
> + edev->dev.parent = dev;
> + edev->dev.class = devfreq_event_class;
> + edev->dev.release = devfreq_event_release_edev;
> +
> + dev_set_name(&edev->dev, "event.%d", atomic_inc_return(&event_no) - 1);
> + ret = device_register(&edev->dev);
> + if (ret < 0) {
> + put_device(&edev->dev);
> + mutex_unlock(&devfreq_event_list_lock);
> + return ERR_PTR(ret);
> + }
> + dev_set_drvdata(&edev->dev, edev);
> +
> + INIT_LIST_HEAD(&edev->node);
> + list_add(&edev->node, &devfreq_event_list);
> + mutex_unlock(&devfreq_event_list_lock);
> +
> + return edev;
> +}



[snip / reversed maybe.. sorry]

> +/**
> + * devfreq_event_is_enabled() - Check whether devfreq-event dev is enabled or
> + * not.
> + * @edev : the devfreq-event device
> + *
> + * Note that this function check whether devfreq-event dev is enabled or not.
> + * If return true, the devfreq-event dev is enabeld. If return false, the
> + * devfreq-event dev is disabled.
> + */
> +bool devfreq_event_is_enabled(struct devfreq_event_dev *edev)
> +{
> + bool enabled = false;
> +
> + if (!edev || !edev->desc)
> + return enabled;
> +
> + mutex_lock(&edev->lock);
> +
> + if (edev->enable_count > 0)
> + enabled = true;
> +
> + if (edev->desc->ops && edev->desc->ops->is_enabled)
> + enabled |= edev->desc->ops->is_enabled(edev);

What does it mean when enabled_count > 0 and ops->is_enabled() is false? or..
What does it mean when enabled_count = 0 and ops->is_enabled() is true?

If you do enable_count in the subsystem, why would we rely on
ops->is_enabled()? Are you assuming that a device MAY turn itself off
without any kernel control (ops->disable()) and it is still a correct
behabior?

> +
> + mutex_unlock(&edev->lock);
> +
> + return enabled;
> +}
> +EXPORT_SYMBOL_GPL(devfreq_event_is_enabled);


Cheers,
MyungJoo