Re: [PATCH 0/1] clk: Meson8/8b/8m2: fix the mali clock flags

From: Jerome Brunet
Date: Mon Dec 16 2019 - 14:17:27 EST



On Mon 16 Dec 2019 at 18:50, Stephen Boyd <sboyd@xxxxxxxxxx> wrote:

> Quoting Jerome Brunet (2019-12-16 01:13:31)
>>
>> On Sun 15 Dec 2019 at 22:01, Martin Blumenstingl <martin.blumenstingl@xxxxxxxxxxxxxx> wrote:
>>
>> > While playing with devfreq support for the lima driver I experienced
>> > sporadic (random) system lockups. It turned out that this was in
>> > certain cases when changing the mali clock.
>> >
>> > The Amlogic vendor GPU platform driver (which is responsible for
>> > changing the clock frequency) uses the following pattern when updating
>> > the mali clock rate:
>> > - at initialization: initialize the two mali_0 and mali_1 clock trees
>> > with a default setting and enable both clocks
>> > - when changing the clock frequency:
>> > -- set HHI_MALI_CLK_CNTL[31] to temporarily use the mali_1 clock output
>> > -- update the mali_0 clock tree (set the mux, divider, etc.)
>> > -- clear HHI_MALI_CLK_CNTL[31] to temporarily use the mali_0 clock
>> ^ no final setting then ? :P
>> > output again
>> >
>> > With the common clock framework we can even do better:
>> > by setting CLK_SET_RATE_PARENT for the mali_0 and mali_1 output gates
>> ^
>> From your patch, I guess you mean CLK_SET_RATE_GATE ?
>>
>> > we can force the common clock framework to update the "inactive" clock
>> > and then switch to it's output.
>> >
>> > I only tested this patch for a limited time only (approx. 2 hours).
>> > So far I couldn't reproduce the sporadic system lockups with it.
>> > However, broader testing would be great so I would like this to be
>> > applied for -next.
>>
>> CLK_SET_RATE_GATE guarantees that a clock cannot be updated while in
>> use. While it works at your advantage here, I'm not sure CCF guarantees
>> the assumption this implementation is based on. Some explanation below:
>>
>> In your case, if it works as you expect when calling set_rate() on the
>> top clock, it goes as this:
>>
>> - mali0 is use with rate X:
>> - => set_rate(mali_top, Y)
>> - mali0 is in use, cannot change, will round rate Y to X
>> - mali1 is not in use, can provide Y
>> - mali1 is determined to be the new best parent for mali top
>>
>> So far so good.
>>
>> - CCF pick the mali1 subtree
>> *start updating the clock from the root to the leaf*
>>
>> So the mali top mux, which choose between mali0 and mali1, will be
>> *updated last* which crucial to your use case.
>>
>> I just wonder if this crucial part something CCF guarantee and you can
>> rely on it ... or if it might break in the future.
>>
>> Stephen, any thoughts on this ?
>
> We have problems with the order in which we call the set_rate clk_op.
> Sometimes clk providers want us to call from leaf to root but instead we
> call from root to leaf because of implementation reasons. Controlling
> the order in which clk operations are done is an unsolved problem. But
> yes, in the future I'd like to see us introduce the vaporware that is
> coordinated clk rates that would allow clk providers to decide what this
> order should be, instead of having to do this "root-to-leaf" update.
> Doing so would help us with the clk dividers that have some parent
> changing rate that causes the downstream device to be overclocked while
> we change the parent before the divider.
>
> If there are more assumptions like this about how the CCF is implemented
> then we'll have to be extra careful to not disturb the "normal" order of
> operations when introducing something that allows clk providers to
> modify it.

I understand that CCR would, in theory, allow to define that sort of
details. Still defining (and documenting) the default behavior would be
nice.

So the question is:
* Can we rely set_rate() doing a root-to-leaf update until CCR comes
around ?
* If not, for use cases like the one described by Martin, I guess we
are stuck with the notifier ? Or would you have something else to
propose ?

>
> Also, isn't CLK_SET_RATE_GATE broken in the case that clk_set_rate()
> isn't called on that particular clk? I seem to recall that the flag only
> matters when it's applied to the "leaf" or entry point into the CCF from
> a consumer API.

It did but not anymore

> I've wanted to fix that but never gotten around to it.

I fixed that already :P
CLK_SET_RATE_GATE is a special case of clock protect. The clock is
protecting itself so it is going down through the tree.


> The whole flag sort of irks me because I don't understand what consumers
> are supposed to do when this flag is set on a clk. How do they discover
> it?

Actually (ATM) the consumer is not even aware of it. If a clock with
CLK_SET_RATE_GATE is enabled, it will return the current rate to
.round_rate() and .set_rate() ... as if it was fixed.

> They're supposed to "just know" and turn off the clk first and then
> call clk_set_rate()?

ATM, yes ... if CCF cannot switch to another "unlocked" subtree (the
case here)

> Why can't the framework do this all in the clk_set_rate() call?

When there is multiple consumers the behavior would become a bit
difficult to predict and drivers may have troubles anticipating that,
maybe, the clock is locked.

>
>>
>> PS: If CCF does guarantee "root-to-leaf" updates, I think this
>> implementation is a clever trick to solve this usual glitch free clock
>> update issue ... much more elegant that the notifier solution we have
>> been using so far.