Re: [PATCH 2/9] clk: Introduce get_parent_hw clk op

From: Stephen Boyd
Date: Tue Feb 05 2019 - 19:01:56 EST


Quoting Jerome Brunet (2019-01-31 10:40:07)
> On Wed, 2019-01-30 at 13:30 -0800, Stephen Boyd wrote:
> > > With this quirk, CCF is making an assumption that might be wrong.
> > >
> > > The quirk is very easy to put in the get_parent() callback of the said
> > > driver or, even better, don't provide the callback if it should not be
> > > called.
> > >
> > > I understand the need for a cautious approach. It seems I'm the only one
> > > with that issue right now and since I have a workaround, there is no
> > > rush. But we must have a plan to make it right.
> > >
> > > To be clear, I'm not against your new API but I don't think it should be a
> > > reason to keep a broken behavior in the framework.
> > >
> >
> > So do you think you can use this new clk_op and ignore the problems with
> > the .get_parent clk op? Putting effort into fixing the .get_parent
> > design isn't very useful from my perspective. There's more than just the
> > problem that we don't call it when .num_parents is 1. There's the
> > inability to return errors without doing weird things to return an index
> > out of range and there isn't any way for us to really know if the clk is
> > an orphan or not. If we can migrate all drivers to use the new clk op
> > then we can fix these problems too, and deprecate and eventually remove
> > the broken by design .get_parent clk op API.
>
> Stephen, I have nothing against your new API; I'm sure it will solve many
> issues.
>
> I'm also quite sure that, like round_rate() and determine_rate(), migrating to
> the new API won't happen overnight. We are likely to still see get_parent()
> for a while. I don't understand why we would keep something wrong when it is
> that easy to fix.
>
> I have spent quite some time debugging this weird behavior of CCF, I'd prefer
> if it can be avoided for others.
>
> Yes, fixing the case I reported does not solve all the problems you have
> mentioned. Keeping this bug does not help either, AFAICT.
>
> The fact is that get_parent() already returns out of bound values on some
> occasions, and we already have to deal with this when converting the index to
> a parent clk_hw pointer. Doing it in the same way when num_parents == 1 does
> not change anything.
>
> I really don't understand why you insist on keeping this special case for
> num_parents == 1, when we know it is not coherent.
>
> Considering that I already proposed the fix, what is the effort here?
> If it is fixing the drivers that rely on this weird thing, I'd be happy to do
> it.
>
>

Ok. I'm happy to merge your patch to always call the .get_parent clk op
when num_parents > 0, but please fix all the drivers and analyze all the
implementations of .get_parent to make sure that they aren't broken by
the change in behavior. Furthermore, please add a debug/warning message
to the code when .get_parent returns a number outside the range
[0, num_parents) so that the offending drivers can be converted to use
.get_parent_hw instead. Ideally nothing would return a parent index
outside the range of possible parents from .get_parent, because this
analysis of drivers would find those implementations and migrate them to
.get_parent_hw instead.
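
To make the requested behavior concrete, here is a rough userspace sketch,
not the actual framework code. Everything named model_* or example_* is
hypothetical and only stands in for the real CCF structures. It assumes that
a clk without a callback falls back to parent index 0 and that an
out-of-range index is treated as an orphan.

```c
#include <stdio.h>

/* Hypothetical userspace model; this is not the kernel's struct clk_core. */
struct model_clk {
	const char *name;
	unsigned int num_parents;
	const struct model_clk **parents;
	/* Models the .get_parent clk op: returns a parent index. */
	unsigned int (*get_parent)(const struct model_clk *clk);
};

/*
 * Ask the driver whenever there is at least one parent (no special case
 * for num_parents == 1) and warn when the index is outside
 * [0, num_parents).
 */
static const struct model_clk *model_init_parent(const struct model_clk *clk)
{
	unsigned int index = 0;	/* assumption: no callback means index 0 */

	if (clk->num_parents == 0)
		return NULL;

	if (clk->get_parent)
		index = clk->get_parent(clk);

	if (index >= clk->num_parents) {
		fprintf(stderr, "%s: .get_parent returned %u, only %u parent(s)\n",
			clk->name, index, clk->num_parents);
		return NULL;	/* treated as an orphan in this model */
	}

	return clk->parents[index];
}

/* Example callbacks and clks, purely illustrative. */
static unsigned int example_first(const struct model_clk *clk)
{
	(void)clk;
	return 0;
}

static unsigned int example_bogus(const struct model_clk *clk)
{
	(void)clk;
	return 255;	/* out of range, e.g. hardware reports an unknown source */
}

static const struct model_clk example_parent = { .name = "parent" };
static const struct model_clk *example_parents[] = { &example_parent };

static const struct model_clk example_ok = {
	.name = "ok", .num_parents = 1,
	.parents = example_parents, .get_parent = example_first,
};

static const struct model_clk example_orphan = {
	.name = "orphan", .num_parents = 1,
	.parents = example_parents, .get_parent = example_bogus,
};
```

With these example clks, model_init_parent(&example_ok) resolves to
example_parent, while model_init_parent(&example_orphan) prints the warning
and returns NULL even though num_parents is 1.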

In parallel, I'd like to convert all drivers to use .get_parent_hw
instead of .get_parent and then remove the .get_parent clk op right
away. I'll start a sweep of the users of clk_hw_get_parent_by_index() (I
see 50 calls in the tree right now) and see if I can convert them to
handle errors returned from that API, probably by just continuing and
ignoring errors. I'll start doing the same conversion for .round_rate
and .determine_rate so that we can get rid of that duplicate clk op as
well. Hopefully that's a mostly mechanical conversion.
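
As an illustration of "continuing and ignoring errors", a caller that walks
its parents might look roughly like the sketch below. This is a userspace
model only: model_hw and model_get_parent_by_index() are hypothetical
stand-ins, and a failed lookup is modeled as NULL rather than whatever error
convention the real API ends up using.

```c
#include <stddef.h>

/* Hypothetical userspace model of a clk_hw with a list of parents. */
struct model_hw {
	unsigned long rate;
	unsigned int num_parents;
	const struct model_hw **parents;	/* NULL entry: lookup fails */
};

/* Hypothetical stand-in for a by-index parent lookup that can fail. */
static const struct model_hw *
model_get_parent_by_index(const struct model_hw *hw, unsigned int index)
{
	if (index >= hw->num_parents)
		return NULL;
	return hw->parents[index];
}

/* Return the highest parent rate, skipping parents whose lookup failed. */
static unsigned long model_best_parent_rate(const struct model_hw *hw)
{
	unsigned long best = 0;
	unsigned int i;

	for (i = 0; i < hw->num_parents; i++) {
		const struct model_hw *parent = model_get_parent_by_index(hw, i);

		if (!parent)
			continue;	/* ignore the error, try the next parent */
		if (parent->rate > best)
			best = parent->rate;
	}

	return best;
}

/* Example data, purely illustrative: the middle parent is missing. */
static const struct model_hw example_slow = { .rate = 24000000UL };
static const struct model_hw example_fast = { .rate = 100000000UL };
static const struct model_hw *example_mux_parents[] = {
	&example_slow, NULL, &example_fast,
};
static const struct model_hw example_mux = {
	.num_parents = 3, .parents = example_mux_parents,
};
```

The loop keeps going past the missing parent, so the example mux still sees
the rate of its third parent.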

For now I'll move this patch to the end of this series so that it
doesn't hold up the rest of the patches.