Re: [PATCH v3 1/2] clk: Add generic sync_state callback for disabling unused clocks

From: Saravana Kannan
Date: Fri May 26 2023 - 20:18:51 EST


On Fri, May 12, 2023 at 3:31 AM Abel Vesa <abel.vesa@xxxxxxxxxx> wrote:
>
> On 23-05-11 17:46:16, Saravana Kannan wrote:
> > On Thu, May 11, 2023 at 5:58 AM Abel Vesa <abel.vesa@xxxxxxxxxx> wrote:
> > >
> > > On 23-02-21 11:58:24, Saravana Kannan wrote:
> > > > On Mon, Feb 20, 2023 at 10:47 AM Abel Vesa <abel.vesa@xxxxxxxxxx> wrote:
> > > > >
> > > > > On 23-02-20 09:51:55, Saravana Kannan wrote:
> > > > > > On Mon, Feb 20, 2023 at 8:28 AM Abel Vesa <abel.vesa@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On 23-02-20 17:46:36, Abel Vesa wrote:
> > > > > > > > On 23-02-17 21:38:22, Stephen Boyd wrote:
> > > > > > > > > Quoting Abel Vesa (2022-12-27 12:45:27)
> > > > > > > > > > There are unused clocks that need to remain untouched by clk_disable_unused,
> > > > > > > > > > and most likely could be disabled later on sync_state. So provide a generic
> > > > > > > > > > sync_state callback for the clock providers that register such clocks.
> > > > > > > > > > Then, use the same mechanism as clk_disable_unused from that generic
> > > > > > > > > > callback, but pass the device to make sure only the clocks belonging to
> > > > > > > > > > the current clock provider get disabled, if unused. Also, during the
> > > > > > > > > > default clk_disable_unused, if the driver that registered the clock has
> > > > > > > > > > the generic clk_sync_state_disable_unused callback set for sync_state,
> > > > > > > > > > skip disabling its clocks.
> > > > > >
> > > > > > Hi Abel,
> > > > > >
> > > > > > We have the day off today, so I'll respond more later. Also, please cc
> > > > > > me on all sync_state() related patches in the future.
> > > > > >
> > > > >
> > > > > Sure thing.
> > > > >
> > > > > > I haven't taken a close look at your series yet, but at a glance it
> > > > > > seems incomplete.
> > > > > >
> > > > > > Any reason you didn't just try to revive my series[1] or nudge me?
> > > > > > [1]- https://lore.kernel.org/lkml/20210407034456.516204-3-saravanak@xxxxxxxxxx/
> > > > >
> > > > > This patchset is heavily reworked and much more simpler as it relies
> > > > > strictly on the sync_state being registered by the clock provider.
> > > >
> > > > It's simpler because it's not complete. It for sure doesn't handle
> > > > orphan-reparenting. It also doesn't make a lot of sense for only some
> > > > clock providers registering for sync_state(). If CC-A is feeding a
> > > > clock signal that's used as a root for clocks in CC-B, then what
> > > > happens if only CC-B implements sync_state() but CC-A doesn't. The
> > > > clocks from CC-B are still going to turn off when CC-A turns off its
> > > > PLL before CC-B registers.
> > >
> > > I gave your patchset a try and it breaks the uart for qcom platforms.
> > > That is because your patchset enables the clock on __clk_core_init and
> > > does not take into account the fact that 'boot enabled' clocks should be
> > > left untouched.
> >
> > Those are probably just hacks when we didn't have sync_state(). But
> > sure, we can make sure existing drivers aren't broken if the flag is
> > set.
>
> I probably didn't make myself clear enough here. ANY clock that is
> enabled (HW-wise) before the kernel boots should remain AS IS, that is, no writing
> the enable bit, no reparenting, and so on. This rule applies to the clock itself
> and for all of its parents. This is because, for some clocks, writing the
> enable bit might lead to glitches. UART is just one example. So, please, do not
> try enabling such clocks until the consumer driver does so.
>
> >
> > > This also means the orphan-reparenting enabling should
> > > be dropped as well.
> >
> > No, maybe for boot enabled clocks, but not for all clocks in general.
> > You need this for sync_state() to work correctly for clocks left on at
> > boot but "boot enabled" isn't set.
>
> I think you lost me here. What do you mean by 'this'? If you mean the
> enabling of orphan clocks, then the rule above still applies. It
> doesn't matter if the clock is an orphan one or not. It can be orphan
> from linux point of view, but the actual parent (even if it is not
> registered with the linux clock tree) might still be enabled. This means
> the clock itself will be also enabled. And by enabling them when
> registering, we can have glitches. Therefore, please, do not do this
> either.
>
> The registering of a boot enabled clock should not change/override/touch
> the current state of it in any way!
>
> Stephen, can you confirm this as well?
>
> >
> > > As for the second part, related to providers that might not have a
> > > registered sync_state(), your patchset sets the clock core generic
> > > one. This is also wrong because it doesn't take into account the fact
> > > that there might be providers that need to do their own stuff on
> > > sync_state() and should do that by registering their own implementation
> > > of it.
> >
> > Right, in which case, they can set theirs or they get the default one.
>
> I'm still not sure that defaulting to the clk_sync_state callback is a
> good choice here. I have to think some more about what the impact is for
> providers that do not have any sync_state callback registered currently.
>
> >
> > > Therefore, I'll respin your patchset and use only the skipping of
> > > disabling the unused clocks, but I'll drop all the enable on init and orphan
> > > reparenting changes.
> >
> > I think it'll result in a broken patch.
>
> Yep, tried that and it doesn't work. What happened was that, because you
> were enabling the 'boot enabled' clocks when registering them (on __clk_core_init),
> the disabling from the sync state needs to be without dropping the enable/prepare
> counts. This is why I think my patchset here is the best alternative he have
> currently, as it does exactly what it is supposed to do, namingly, to leave
> untouched the boot enabled clocks until sync state and then disabling
> them with via clk_disable_unused_subtree which calls the disable and
> unprepare ops without decrementing the prepare and enable counts.
>
> >
> > Sorry, I've been a bit busy with some other work and I haven't been
> > able to get to the clk_sync_state(). I'll try to rebase it soon and
> > send it out too.
>
> Well, I already did that and I described above why that won't help.

The biggest disconnect is that you seem to think boot enabled clocks
should be untouched until sync_state() is called on them. But that's
not a valid assumption. Boot enabled clocks can have multiple
consumers and one of them might want to change the frequency before
the other one probes. That's perfectly valid. In some cases, we might
need to make sure the clock frequency doesn't go higher than the clock
frequency at boot (we can make that a flag). Actually, even if there's
only one consumer, that consumer might change the clock frequency at
probe -- since sync_state() only comes after the probe(), we need to
make sure we allow reparenting and frequency changes to boot enabled
clocks before sync_state() is called.

Also, consider this example.

PLL1 -> mux1 -> clock_gate1 -> consumer1.
PLL1 -> mux2 -> clock_gate2 -> consumer2.

If I don't do orphan handling/reparenting like I do so in my patch,
PLL1 could get turned off after consumer 1 probes depending on which
clock controller each of those blocks are on.

I'm pretty sure I actually identified this issue when I wrote my patch
by testing it on QC hardware. So this is not some "other" hardware
issue. This actually affects QC hardware too. Maybe you haven't
upstreamed all of your hardware drivers, but this is not some
imaginary scenario.

Your main problem seems to be that your hardware can't handle writing
1 to the enable bit of a clock that's already on. If that's the case,
protect against that inside your driver. I'm even okay with figuring
out a way to try and support that at a framework level. But to say you
don't need to reparenting or the orphan handling is definitely wrong.

-Saravana