Re: [PATCH v2] clk: mediatek: clk-mux: Add .determine_rate() callback

From: Stephen Boyd
Date: Thu Oct 27 2022 - 20:31:04 EST


Quoting Maxime Ripard (2022-10-18 00:06:53)
>
> We spent some time with Angelo yesterday debugging this, and it's still
> not clear to me what happens.
>
> He provided a significant amount of traces, and we first checked that
> the round_rate part of set_rate was returning something meaningful, and
> it does. The rate is fine, the parent is too, everything's great.
>
> Next we checked the decisions by clk_calc_new_rates, and it does return
> the proper top clock, and its proper rate.
>
> Finally, we hooked into clk_change_rate() to see what kind of decision
> it was enforcing, and it seems to be ok as well. It doesn't change
> parent, and it sets the proper rate, in both cases.
>
> There's still one thing we haven't checked: one of the clock in the tree
> (the parent of the one we want to change the rate on, and it has
> SET_RATE_PARENT) has a notifier. As we've had a bug recently over this
> I've not ruled out that this could be a similar bug.
>
> I don't really think it is though, since the notifier callback doesn't
> use the data provided by the framework:
> https://elixir.bootlin.com/linux/v6.1-rc1/source/drivers/clk/mediatek/clk-mux.c#L279
>
> I've pushed a branch for Angelo to test, just to confirm.
>
> So... yeah. I can't explain the regression at this point. Do you have an
> idea?

I don't really know. If the removal of WARN_ON() helps then it sounds
like a console related problem where we hang the system trying to print
the warning to the console. Did you try replacing that case with a
trace_printk()?

>
> The good news is, since you merged this patch the regression is
> invisible now to that platform. We still could encounter it on another
> platform, but maybe it will also have a more obvious setup to replicate?
>

I guess that's good news :-/