Re: [Patch v3 3/6] cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support

From: Thara Gopinath
Date: Fri Jul 09 2021 - 11:37:25 EST




On 7/9/21 2:46 AM, Viresh Kumar wrote:
On 08-07-21, 08:06, Thara Gopinath wrote:
static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
{
struct platform_device *pdev = cpufreq_get_driver_data();
@@ -370,6 +480,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
dev_warn(cpu_dev, "failed to enable boost: %d\n", ret);
}
+ ret = qcom_cpufreq_hw_lmh_init(policy, index);

You missed unregistering EM here (which is also missing from exit,
which you need to fix first in a separate patch).

Hi!

So how exactly do you do this? I checked other users of the API and I do not see any of them freeing it. If it is needed, I would say it should be a separate patch, outside of this series.


+ if (ret)
+ goto error;
+
return 0;
error:
kfree(data);
@@ -389,6 +503,10 @@ static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
dev_pm_opp_remove_all_dynamic(cpu_dev);
dev_pm_opp_of_cpumask_remove_table(policy->related_cpus);
+ if (data->lmh_dcvs_irq > 0) {
+ devm_free_irq(cpu_dev, data->lmh_dcvs_irq, data);

Why use the devm variants here and while requesting the IRQ?

+ cancel_delayed_work_sync(&data->lmh_dcvs_poll_work);
+ }

Please move this to qcom_cpufreq_hw_lmh_exit() or something.

Ok.


Now, with the sequence of disabling the interrupt, etc., I see a
potential problem.

CPU0                                    CPU1

qcom_cpufreq_hw_cpu_exit()
-> devm_free_irq();
                                        qcom_lmh_dcvs_poll()
                                        -> qcom_lmh_dcvs_notify()
                                          -> enable_irq()

-> cancel_delayed_work_sync();


What will happen if enable_irq() gets called after freeing the IRQ?
Not sure, but it looks like you will then hit this from manage.c:

WARN(!desc->irq_data.chip, KERN_ERR "enable_irq before setup/request_irq: irq %u\n", irq))

?

You got a chicken n egg problem :)

Yes indeed! But it is also a very rare chicken-and-egg problem.
The scenario here is that the CPUs are busy running a load that causes a thermal overrun, so LMh is engaged, and at the same time the CPU is trying to exit/disable cpufreq. Calling cancel_delayed_work_sync() first could solve this issue, right? cancel_delayed_work_sync() guarantees that the work is not pending on return, even if it requeues itself. So once the delayed work is cancelled, the interrupts can be safely disabled. Thoughts?




--
Warm Regards
Thara (She/Her/Hers)