Re: [PATCH] Revert "clk: fix __clk_init_parent() for single parent clocks"

From: Stephen Boyd
Date: Thu Dec 06 2018 - 17:08:25 EST


Sorry this email is long.

TL;DR is that I don't think we need to revert the patch as you suggest.
Instead, we should fix the driver to indicate NULL as the parent name of
the index it tells the framework about.

Quoting Jerome Brunet (2018-12-05 09:20:09)
> On Tue, 2018-12-04 at 23:54 -0800, Stephen Boyd wrote:
> >
> > Ok. Thanks for the explanation. What do you do to fix this problem in
> > the 'ao_cts_cec' clk implementation?
>
> IMO, there is nothing to fix, the implementation is correct.
>
> > Sounds like you just rely on
> > clk_set_rate() to fix this all up for you when the consumer changes the
> > rate?
>
> To setup the rate as the customer expect, yes.

Ok.

>
> >
> > If we have something that is default parented to something we're not
> > telling the framework about for some reason then one solution would be
> > to have some init code reparent the clk in hardware before registering
> > it with the framework. Basically "fix up" the clk tree in hardware to be
> > consistent with what software can reason about.
>
> Should we really ? Something the framework does not know about is not
> necessarily something wrong.
>
> It would best if we could model the tree completly, but most of the time, we
> just approximate it the best we can.
>
> The framework just knows how to setup rates, it has no idea what rate needs to
> be set or when - And I think it is best if does not make any assumption about
> that.
>
> Pushing it a bit, this unknown parent might important to the boot sequence.
> How would you know when it is safe to change the setting ?
> What would change it to ? It Seems obvious when there is only one known
> parent, but what if there two ? which one do you pick ? Is it really the role
> of CCF to make this kind of choice ?

Agreed. We can't mandate this requirement because of what you say. There
are pitfalls such as not always knowing if it's safe to change the clk
tree before clks are registered.

>
> >
> > This also reminds me of some audio clks I've seen before where the
> > default parent is some external pin and it can be reparented to an
> > internal clk with clk_set_rate/parent. In that case, I think we forced
> > the parent over to the internal clk before registering so that it would
> > always get parented properly in the framework. Right now, this is going
> > to cause problems for plans to probe defer clk_get() on orphan clks.
>
> clk_get() on orphaned clock should probably return an error if it is not the
> case already ? or 0 maybe ?

I'd prefer that it return EPROBE_DEFER if the clk is an orphan, except
that it causes problems for this case where muxes move away from
something unknown to something known.

>
> >
> > Maybe this could work better if we allowed
> > 'assigned-clock-parents/rates' to reparent clks regardless of orphan
> > status and/or had some flag that could be set on clks to indicate that
> > we're OK if clk_get() is called on it when it's an orphan because this
> > isn't a problem?
>
> Not sure I understand your point here.

There is a problem with assigned-clock DT properties and orphan clks and
making orphan clks defer driver probes.

>
> >
> > Or we can make the defer on orphan code only defer if the clk has a
> > single parent and it's an orphan and return the clk if there is at least
> > one parent of the clk that has been registered and isn't marked as an
> > orphan. That would need to be recursively applied, so I guess we would
> > update our orphan marking code to indicate that clk_get() on the clk
> > should probe defer or not instead of indicating the clk's orphan status.
> > Doing this would also side-step the problem Rockchip was having where
> > certain parents never appeared but the clk was parented to it in
> > hardware, essentially blocking the clk from ever being touched by
> > consumers.
>
> ... and now you lost me completly :)

Hmm alright.

>
> >
> > On the other hand, not having the clk there causes us to do a global
> > search of the clk name space each time we look for parents by the
> > "unknown" index, which in the Rockchip case was quite often because the
> > clk was changing between two other parents with a third one that never
> > got registered. So it may be better to just register an "unknown" clk as
> > a root clk with a rate of 0 that you can then hook everything up to that
> > is hardware reset to this unknown input. That way nothing is orphaned
> > and everyone is happy. We could do this for the audio case I talked
> > about earlier too so when the external pin is left unconnected we could
> > register some 'unconnected' clk and parent the audio clk to it.
>
> We have sometimes a few orphans in the amlogic clock tree, it does not seems
> to a problem.

I suppose drivers aren't calling clk_set_rate() on the orphaned clks and
experiencing problems? There are problems around clk rates that could
happen if there aren't any proper parents of a mux.

>
> I don't really understand the benefit of having a fake clock that would adopt
> all the orphan ? You guess there is problem I'm not aware of ...

I see three cases that can go wrong if we make orphan clks probe defer
clk_get():

1) NULL as the parent name for some parent index that the hardware
indicates is the current parent of the clk when the clk is registered.

In this case, we need to make clk_get() succeed and not return
-EPROBE_DEFER, so I think we need to make a special "unknown" clk in the
framework so that clks aren't orphaned when the hardware is indicating
the parent is something we don't (and won't) know anything about.

2) get_parent/set_parent on a clk with num_parents == 1

This seems to be how ao_cts_cec is implemented. This hides the fact that
get_parent may return some number above the number of parents and then
has that quietly make the parent of the clk NULL during registration so
that it's orphaned. Then we rely on set_rate() or set_parent() being
called to fix the parent chain at a later time.

That is complicated and non-obvious. I'd rather that we specify NULL as
a parent in the parent_names array if we're going to return that index
to the framework and then have the framework go to case #1. This makes
it easier for the framework to assume that [0, num_parents) is the valid
range of indices returned from the get_parent() op and a number outside
that range is an error, clearing the way for errors to be returned from
get_parent() if it somehow fails.

3) A string as the parent name for some parent index that the hardware
indicates is the current parent of the clk when the clk is registered,
but that parent clk may never be registered with the framework.

I imagine this can happen with an external gpio clk that feeds some SoC
internal clk. That gpio could be supplied by some clk signal that the
board designer decides to send there, and it could also just "magically
appear" by applying some DT overlay. Due to our use of global strings
for parent linking, we don't have a good way to figure out that nothing
is connected to some clock input signal, making this quite a nightmare
to figure out a runtime!

One solution is to replace the parent name with NULL and ignore the
runtime update case. This will work and bring us back to case #1. The
problem is that if it is connected to something like a GPIO and it
defaults to that pin instead of some internal clk we're stuck with an
"orphan" clk that we can't clk_get(). I suppose the driver could read DT
and see if this pin is not connected and then mark this input parent
name as NULL. We can punt on solving the DT overlay problem until later,
because we would need some way to dynamically resolve a new parent of a
clk and update the tree when the DT node is modified to use a new GPIO.