Re: [RFC PATCH v2 4/6] PM: opp: allow control of multiple clocks

From: Krzysztof Kozlowski
Date: Mon May 09 2022 - 06:43:57 EST


On 25/04/2022 09:27, Viresh Kumar wrote:
> On 11-04-22, 17:43, Krzysztof Kozlowski wrote:
>> Devices might need to control several clocks when scaling the frequency
>> and voltage. Example is the Universal Flash Storage (UFS) which scales
>> several independent clocks with change of performance levels.
>>
>> Add parsing of multiple clocks and clock names
>
> This part is fine, the OPP core should be able to do this.

Sorry for late reply, I think I my filters archived it or I missed it.

>
>> and scale all of them,
>
> This is tricky as the OPP core can't really assume the order in which the clocks
> needs to be programmed. We had the same problem with multiple regulators and the
> same is left for drivers to do via the custom-api.
>
> Either we can take the same route here, and let platforms add their own OPP
> drivers which can handle this, Or hide this all behind a basic device clock's
> driver, which you get with clk_get(dev, NULL).

For my use case, the order of scaling will be the same as in previous
implementation, because UFS drivers just got bunch of clocks with
freq-table-hz property and were scaling in DT order.

If drivers need something better, they can always provide custom-opp
thus replacing my method. My implementation here does not restrict them.

For the drivers where the order does not matter, why forcing each driver
to provide its own implementation of clock scaling? Isn't shared generic
PM OPP code a way to remove code duplication?

>
>> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
>
>> +static int _generic_set_opp_clks_only(struct device *dev,
>> + struct opp_table *opp_table,
>> + struct dev_pm_opp *opp)
>> +{
>> + int i, ret;
>> +
>> + if (!opp_table->clks)
>> + return 0;
>> +
>> + for (i = 0; i < opp_table->clk_count; i++) {
>> + if (opp->rates[i]) {
>
> This should mean that we can disable that clock and it isn't required.

No, it does not mean that. The DT might provide several clocks which
only some are important for frequency scaling. All others just need to
be enabled.

Maybe you prefer to skip getting such clocks in PM OPP?

>
>> + ret = _generic_set_opp_clk_only(dev, opp_table->clks[i],
>> + opp->rates[i]);
>> + if (ret) {
>> + dev_err(dev, "%s: failed to set clock %pC rate: %d\n",
>> + __func__, opp_table->clks[i], ret);
>> + return ret;
>> + }
>> + }
>> + }
>> +
>> + return 0;
>> +}
>
> As said earlier, this won't work in the core.
>
>> +
>> static int _generic_set_opp_regulator(struct opp_table *opp_table,
>> struct device *dev,
>> struct dev_pm_opp *opp,
>> @@ -796,7 +835,7 @@ static int _generic_set_opp_regulator(struct opp_table *opp_table,
>> }
>>
>> /* Change frequency */
>> - ret = _generic_set_opp_clk_only(dev, opp_table->clk, freq);
>> + ret = _generic_set_opp_clks_only(dev, opp_table, opp);
>> if (ret)
>> goto restore_voltage;
>>
>> @@ -820,7 +859,7 @@ static int _generic_set_opp_regulator(struct opp_table *opp_table,
>> return 0;
>>
>> restore_freq:
>> - if (_generic_set_opp_clk_only(dev, opp_table->clk, old_opp->rate))
>> + if (_generic_set_opp_clks_only(dev, opp_table, old_opp))
>> dev_err(dev, "%s: failed to restore old-freq (%lu Hz)\n",
>> __func__, old_opp->rate);
>> restore_voltage:
>> @@ -880,7 +919,7 @@ static int _set_opp_custom(const struct opp_table *opp_table,
>
> This is where we can handle it in your case, if you don't want to hide it behind
> a clk driver.
>
>> }
>>
>> data->regulators = opp_table->regulators;
>> - data->clk = opp_table->clk;
>> + data->clk = (opp_table->clks ? opp_table->clks[0] : NULL);
>> data->dev = dev;
>> data->old_opp.rate = old_opp->rate;
>> data->new_opp.rate = freq;
>> @@ -969,8 +1008,8 @@ static void _find_current_opp(struct device *dev, struct opp_table *opp_table)
>
> I think this routine breaks as soon as we add support for multiple clocks.
> clks[0]'s frequency can be same for multiple OPPs and this won't get you the
> right OPP then.

I don't think so and this was raised also by Stephen - only the first
clock is considered the one used for all PM OPP frequency operations,
like get ceil/floor.

The assumption (which might need better documentation) is that first
clock frequency is the main one:
1. It is still in opp->rate field, so it is used everywhere when OPPs
are compared/checked for rates.
1. Usually is used also in opp-table nodes names.

The logical explanation is that devices has some main operating
frequency, e.g. the core clock, and this determines the performance. In
the same time such device might not be able to scale this one core clock
independently from others, therefore this set of patches.

>
>> struct dev_pm_opp *opp = ERR_PTR(-ENODEV);
>> unsigned long freq;
>>
>> - if (!IS_ERR(opp_table->clk)) {
>> - freq = clk_get_rate(opp_table->clk);
>> + if (opp_table->clks && !IS_ERR(opp_table->clks[0])) {
>> + freq = clk_get_rate(opp_table->clks[0]);
>> opp = _find_freq_ceil(opp_table, &freq);
>> }
>>
>> @@ -1070,7 +1109,7 @@ static int _set_opp(struct device *dev, struct opp_table *opp_table,
>> scaling_down);
>> } else {
>> /* Only frequency scaling */
>> - ret = _generic_set_opp_clk_only(dev, opp_table->clk, freq);
>> + ret = _generic_set_opp_clks_only(dev, opp_table, opp);
>> }
>>
>> if (ret)
>> @@ -1135,11 +1174,15 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
>
> This should have a BUG or WARN _ON() now if clock count is more than one. This
> routine can't be called unless custom handler is available.
>
> I skipped rest of the code as we need to work/decide on the design first.



Best regards,
Krzysztof