Re: [PATCH] Revert "clk: Fix invalid execution of clk_set_rate"

From: Manivannan Sadhasivam
Date: Tue Dec 03 2024 - 04:23:34 EST


On Tue, Dec 03, 2024 at 09:25:01AM +0100, Johan Hovold wrote:
> [ +CC: Viresh and Sudeep ]
>
> On Mon, Dec 02, 2024 at 05:20:06PM -0800, Stephen Boyd wrote:
> > Quoting Johan Hovold (2024-12-02 02:06:21)
> > > This reverts commit 25f1c96a0e841013647d788d4598e364e5c2ebb7.
> > >
> > > The offending commit results in errors like
> > >
> > > cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22
> > >
> > > spamming the logs on the Lenovo ThinkPad X13s and other Qualcomm
> > > machines when cpufreq tries to update the CPUFreq HW Engine clocks.
> > >
> > > As mentioned in commit 4370232c727b ("cpufreq: qcom-hw: Add CPU clock
> > > provider support"):
> > >
> > > [T]he frequency supplied by the driver is the actual frequency
> > > that comes out of the EPSS/OSM block after the DCVS operation.
> > > This frequency is not same as what the CPUFreq framework has set
> > > but it is the one that gets supplied to the CPUs after
> > > throttling by LMh.
> > >
> > > which seems to suggest that the driver relies on the previous behaviour
> > > of clk_set_rate().
> >
> > I don't understand why a clk provider is needed there. Is anyone looking
> > into the real problem?
>
> I mentioned this to Mani yesterday, but I'm not sure if he has had time
> to look into it yet. And I forgot to CC Viresh who was involved in
> implementing this. There is comment of his in the thread where this
> feature was added:
>
> Most likely no one will ever do clk_set_rate() on this new
> clock, which is fine, though OPP core will likely do
> clk_get_rate() here.
>
> which may suggest that some underlying assumption has changed. [1]
>

I just looked into the issue this morning. The commit that triggered the errors
seems to be doing the right thing (although the commit message was a bit hard to
understand), but the problem is this check, which now gets triggered:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/clk.c?h=v6.13-rc1#n2319

Since the qcom-cpufreq* clocks don't have parents now (they should've been
defined anyway) and there is no CLK_SET_RATE_PARENT flag set, the check returns
NULL for the 'top' clock. Then clk_core_set_rate_nolock() returns -EINVAL,
causing the reported error.
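
To make the failure path concrete, here is a minimal userspace sketch of the
behaviour described above. It is not the kernel code: struct sketch_clk,
SKETCH_SET_RATE_PARENT, sketch_find_top() and sketch_set_rate() are
hypothetical names used only for illustration, and the real clk core does far
more (rate rounding, notifiers, locking).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for CLK_SET_RATE_PARENT; the value is illustrative. */
#define SKETCH_SET_RATE_PARENT (1UL << 2)

/* Hypothetical, heavily simplified clock node. */
struct sketch_clk {
	const char *name;
	struct sketch_clk *parent;	/* NULL when no parent is defined */
	unsigned long flags;
	unsigned long rate;
};

/*
 * Models the check discussed above: with no parent, or without the
 * "set rate on parent" flag, no 'top' clock is found and NULL is returned.
 */
static struct sketch_clk *sketch_find_top(struct sketch_clk *clk)
{
	if (!clk->parent || !(clk->flags & SKETCH_SET_RATE_PARENT))
		return NULL;

	return clk->parent;
}

/* Models the caller: a NULL 'top' is turned into -EINVAL. */
static int sketch_set_rate(struct sketch_clk *clk, unsigned long rate)
{
	struct sketch_clk *top = sketch_find_top(clk);

	if (!top)
		return -EINVAL;

	/* Assumption: the rate change is simply recorded on the clock. */
	clk->rate = rate;
	return 0;
}
```

Calling sketch_set_rate() on a clock with parent == NULL and flags == 0, as the
qcom-cpufreq* clocks are described above, yields -EINVAL, which is the -22 seen
in the log message.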

But I don't quite understand why clk_core_set_rate_nolock() fails if there is no
parent or CLK_SET_RATE_PARENT is not set. The API is supposed to set the rate of
the passed clock irrespective of the parent. Propagating the rate change to the
parent is not strictly needed and doesn't make sense if the parent is a fixed
clock like XO.

Stephen, thoughts?

- Mani

--
மணிவண்ணன் சதாசிவம்