Re: [PATCH] Revert "clk: fix __clk_init_parent() for single parent clocks"

From: Jerome Brunet
Date: Wed Dec 05 2018 - 12:20:18 EST


On Tue, 2018-12-04 at 23:54 -0800, Stephen Boyd wrote:
> Quoting Jerome Brunet (2018-12-04 11:51:17)
> > On Tue, 2018-12-04 at 10:05 -0800, Stephen Boyd wrote:
> > > Quoting Jerome Brunet (2018-12-04 08:32:57)
> > > > This reverts commit 2430a94d1e719b7b4af2a65b781a4c036eb22e64.
> > > >
> > > > From the original commit message:
> > > > It turned out a problem because there are some single-parent clocks
> > > > that implement .get_parent() callback and return non-zero index.
> > > > The SOCFPGA clock is the case; the commit broke the SOCFPGA boards.
> > > >
> > > > It is wrong for a clock to return an index >= num_parents. CCF checks
> > > > for this condition in clk_core_get_parent_by_index(). This function
> > > > sets
> > > > the parent to NULL if index is incoherent, which seems like the only
> > > > sane choice.
> > > >
> > > > commit 2430a94d1e71 ("clk: fix __clk_init_parent() for single parent
> > > > clocks")
> > > > appears to be a work around installed in the core framework for a
> > > > problem
> > > > that is platform specific, and should probably be fixed in the
> > > > platform
> > > > code.
> > >
> > > Ouch. I see that I even pointed that out in 2016 but never got a reply
> > > or a fix patch[1].
> > >
> > > > Further more, it introduces a problem in a corner case of the mux
> > > > clock.
> > > > Take mux with multiple parents, but only one is known, the rest being
> > > > undocumented. The register reset has one of unknown parent set.
> > > >
> > > > Before commit 2430a94d1e71 ("clk: fix __clk_init_parent() for single
> > > > parent clocks"):
> > > > * get_parent() is called, register is read and give an unknown index.
> > > > -> the mux is orphaned.
> > > > * a call to set_rate() will reparent the mux to the only known
> > > > parent.
> > > >
> > > > With commit 2430a94d1e71 ("clk: fix __clk_init_parent() for single
> > > > parent
> > > > clocks"):
> > > > * the register is never read.
> > > > * parent is wrongly assumed to be the only known one.
> > > > As a consequence, all the calculation deriving from the mux will be
> > > > wrong
> > > > * Since we believe the only know parent to be set, set_parent() won't
> > > > ever be called and the register won't be set with the correct
> > > > value.
> > >
> > > Isn't this the broken bad case all over again? Why register a clk as a
> > > mux and then only say it has one parent?
> >
> > I understand it is a bit odd but as I explained it is a corner case.
> >
> > We are really trying to drive a mux here, applying a values will change
> > the
> > clock signal we get. Documentation being what it is, we only know one the
> > parent. The other parent could anything or maybe not connected at all, who
> > know. That is not the important part actually
> >
> > If such mux was already set to the known entry by default, it would be OK
> > to
> > ignore it. But if it is not, then we CCF to realise that and change the
> > setting accordingly.
> >
> > This the case of the 'ao_cts_cec' clock in the following patch:
> > https://lore.kernel.org/patchwork/patch/1021028/
> >
> > by default the value in the register is 0, but the only one that makes
> > sense
> > for us is 1.
>
> Ok. Thanks for the explanation. What do you do to fix this problem in
> the 'ao_cts_cec' clk implementation?

IMO, there is nothing to fix, the implementation is correct.

> Sounds like you just rely on
> clk_set_rate() to fix this all up for you when the consumer changes the
> rate?

To setup the rate as the customer expect, yes.

>
> If we have something that is default parented to something we're not
> telling the framework about for some reason then one solution would be
> to have some init code reparent the clk in hardware before registering
> it with the framework. Basically "fix up" the clk tree in hardware to be
> consistent with what software can reason about.

Should we really ? Something the framework does not know about is not
necessarily something wrong.

It would best if we could model the tree completly, but most of the time, we
just approximate it the best we can.

The framework just knows how to setup rates, it has no idea what rate needs to
be set or when - And I think it is best if does not make any assumption about
that.

Pushing it a bit, this unknown parent might important to the boot sequence.
How would you know when it is safe to change the setting ?
What would change it to ? It Seems obvious when there is only one known
parent, but what if there two ? which one do you pick ? Is it really the role
of CCF to make this kind of choice ?

>
> This also reminds me of some audio clks I've seen before where the
> default parent is some external pin and it can be reparented to an
> internal clk with clk_set_rate/parent. In that case, I think we forced
> the parent over to the internal clk before registering so that it would
> always get parented properly in the framework. Right now, this is going
> to cause problems for plans to probe defer clk_get() on orphan clks.

clk_get() on orphaned clock should probably return an error if it is not the
case already ? or 0 maybe ?

>
> Maybe this could work better if we allowed
> 'assigned-clock-parents/rates' to reparent clks regardless of orphan
> status and/or had some flag that could be set on clks to indicate that
> we're OK if clk_get() is called on it when it's an orphan because this
> isn't a problem?

Not sure I understand your point here.

>
> Or we can make the defer on orphan code only defer if the clk has a
> single parent and it's an orphan and return the clk if there is at least
> one parent of the clk that has been registered and isn't marked as an
> orphan. That would need to be recursively applied, so I guess we would
> update our orphan marking code to indicate that clk_get() on the clk
> should probe defer or not instead of indicating the clk's orphan status.
> Doing this would also side-step the problem Rockchip was having where
> certain parents never appeared but the clk was parented to it in
> hardware, essentially blocking the clk from ever being touched by
> consumers.

... and now you lost me completly :)

>
> On the other hand, not having the clk there causes us to do a global
> search of the clk name space each time we look for parents by the
> "unknown" index, which in the Rockchip case was quite often because the
> clk was changing between two other parents with a third one that never
> got registered. So it may be better to just register an "unknown" clk as
> a root clk with a rate of 0 that you can then hook everything up to that
> is hardware reset to this unknown input. That way nothing is orphaned
> and everyone is happy. We could do this for the audio case I talked
> about earlier too so when the external pin is left unconnected we could
> register some 'unconnected' clk and parent the audio clk to it.

We have sometimes a few orphans in the amlogic clock tree, it does not seems
to a problem.

I don't really understand the benefit of having a fake clock that would adopt
all the orphan ? You guess there is problem I'm not aware of ...

>
> > > > Signed-off-by: Jerome Brunet <jbrunet@xxxxxxxxxxxx>
> > > > ---
> > >
> > > Is this related to the other patch you sent? Can you send series for
> > > related patches please?
> > >
> > > [1] https://lkml.kernel.org/r/20160209181833.GA24167@xxxxxxxxxxxxxx
> >
> > Actually I was intially doing a series, and stopped when my cover letter
> > started with "those are two unrelated patches ..." ;)
> >
> > I found these things while debugging the same thing but there is no deps
> > between them.
> >
>
> Ok.