Re: [PATCH 5/5] clk: samsung: gs101: don't mark non-essential clocks as critical
From: Sam Protsenko
Date: Mon Jan 29 2024 - 14:16:36 EST
On Mon, Jan 29, 2024 at 8:37 AM André Draszik <andre.draszik@xxxxxxxxxx> wrote:
>
> Hi Sam,
>
> On Fri, 2024-01-26 at 21:30 -0600, Sam Protsenko wrote:
> > On Fri, Jan 26, 2024 at 6:37 PM André Draszik <andre.draszik@xxxxxxxxxx> wrote:
> >
> > >
> > > Note that this commit has the side-effect of causing earlycon to stop
> > > working some time into the boot, for two reasons:
> > > * peric0_top1_ipclk_0 requires its parent gout_cmu_peric0_ip to be
> > > running, but because earlycon doesn't deal with clocks that
> > > parent will be disabled when none of the other drivers that
> > > actually deal with clocks correctly require it to be running and
> > > the real serial driver (which does deal with clocks) hasn't taken
> > > over yet
> >
> > That's weird. Doesn't your bootloader set up serial clocks properly?
> > AFAIU, earlycon should rely on everything already configured in
> > bootloader.
>
> I tried to explain that above, but let me try again...
>
> The console UART, and I2C bus 8 are on the same cmu_peric0 controller, and
> that cmu_peric0 has two clocks coming from cmu_top, ip and bus. For I2C8 & UART
> to work, both of these clocks from cmu_top need to be on as they are the
> parent of the i2c8-(ip|pclk) and uart-(ip|pclk) each.
>
> The bootloader leaves those clocks running, yes. So earlycon works (for a
> while).
>
> At some point into the boot, one of two things happens:
> 1) Linux will load the i2c driver. That driver does clock handling
> (correctly), it will initialise and then it has nothing to do, therefore it
> disables cmu_peric0's i2c8 ip and pclk clocks. Because at that stage nothing
> appears to be using the cmu_peric0's ip clock (the real serial driver hasn't
> initialised yet), Linux decides to also disable the parent ip clock coming
> from cmu_top.
>
> At this stage, the earlycon driver stops working, as the parent ip clock of
> the uart ip clock is not running any more. No serial output can be observed
> from this stage onwards. I think what is probably happening is that the
> console uart FIFO doesn't get emptied anymore, and earlycon will simply wait
> forever for space to become available in the FIFO (but I didn't debug this).
>
> Anyway, the boot doesn't progress, the system appears to hang. In any case it's
> not usable as we have no other means of using it at this stage (network /
> usb / display etc.).
>
> 2) Alternatively, the UART driver will load at this stage. Again, it will
> tweak the clocks and after probe it will leave its clocks disabled. The
> serial console driver hasn't taken over at this stage and earlycon is still
> active. Again, the system will hang, because IP and PCLK have been disabled
> by the UART driver. Once the serial console is enabled, the clocks are
> enabled again, but because earlycon is still waiting for progress, the
> boot doesn't progress past disabling ip and pclk. It never gets to enabling
> the serial console (re-enabling the clocks).
>
> So in both cases we get some output from earlycon, but the system hangs once
> the first consumer driver of an IP attached to cmu_peric0 has completed
> probing.
>
>
>
> > > * hand-over between earlycon and serial driver appears to be
> > > fragile and clocks get enabled and disabled a few times, which
> > > also causes register access to hang while earlycon is still
> > > active
> > > Nonetheless we shouldn't keep these clocks running unconditionally just
> > > for earlycon. Clocks should be disabled where possible. If earlycon is
> > > required in the future, e.g. for debug, this commit can simply be
> > > reverted (locally!).
> >
> > That sounds... not ideal. The ability to enable earlycon just by
> > adding some string to bootargs can be very useful for developers.
> > Maybe just make those clocks CLK_IGNORE_UNUSED, if that keeps earlycon
> > functional? With corresponding comments of course.
>
> CLK_IGNORE_UNUSED doesn't help in this case: the i2c and uart drivers will load
> and probe before earlycon gets disabled, and as part of their probing they
> disable the cmu_top ip clock going to cmu_peric0.
>
> If earlycon is not enabled on the kernel command line, everything works fine:
> the kernel buffers its messages and once the real serial console driver
> starts, all messages since boot are printed.
>
> Other than keeping it as CLK_IS_CRITICAL, there is no way that I can see to
> make the hand-over from earlycon to the real serial driver work in
> all cases.
>
> They are not critical clocks for the system, though, so it's wrong to always
> keep them running unconditionally.
>
> We are past a stage where earlycon is generally required.
>
> If it's required for some local development, people can revert this patch locally.
>
That sounds reasonable. But I wonder if that bit (about making this
clock CLK_IS_CRITICAL to make earlycon functional) can be documented
somewhere. Perhaps in the serial driver (earlycon function), or
somewhere in device tree bindings? Because otherwise it might remain
arcane knowledge and people won't be able to use earlycon later.
Anyways, for this patch:
Reviewed-by: Sam Protsenko <semen.protsenko@xxxxxxxxxx>
and if you think it makes sense to document the bit above, please do.
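
To make the bit worth documenting concrete, here is a rough userspace
model of the enable-count behaviour described above. It is only a sketch:
the struct and the model_* / earlycon_survives_i2c_probe names are made up
for illustration and are not kernel APIs, and it assumes the usual CCF
semantics where CLK_IS_CRITICAL takes a reference at registration while
CLK_IGNORE_UNUSED only affects the late sweep of unused clocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one clock node; NOT the kernel's struct clk_core. */
struct model_clk {
	struct model_clk *parent;
	unsigned int enable_count;	/* references held by consumers */
	bool critical;			/* models CLK_IS_CRITICAL */
	bool ignore_unused;		/* models CLK_IGNORE_UNUSED; only the late
					 * unused-clock sweep would read it, and
					 * that sweep is not modelled here */
	bool running;			/* gate state; bootloader may leave it on */
};

static void model_enable(struct model_clk *c)
{
	if (!c)
		return;
	if (c->enable_count++ == 0) {
		model_enable(c->parent);	/* first user: enable parent too */
		c->running = true;
	}
}

static void model_disable(struct model_clk *c)
{
	if (!c)
		return;
	assert(c->enable_count > 0);	/* unbalanced disable is a bug */
	if (--c->enable_count == 0) {
		c->running = false;
		model_disable(c->parent);	/* last user gone: release parent */
	}
}

/* At registration, a critical clock gets a permanent reference. */
static void model_register(struct model_clk *c)
{
	if (c->critical)
		model_enable(c);
}

/*
 * Replays the scenario from the thread: the bootloader leaves everything
 * running, earlycon holds no references, and the i2c driver enables and
 * then disables its clock during probe. Returns whether the shared parent
 * (and therefore the UART that earlycon writes to) is still clocked.
 */
bool earlycon_survives_i2c_probe(bool critical, bool ignore_unused)
{
	struct model_clk top_ip = {
		.running = true,
		.critical = critical,
		.ignore_unused = ignore_unused,
	};
	struct model_clk i2c_ip = { .parent = &top_ip, .running = true };

	model_register(&top_ip);
	model_register(&i2c_ip);

	model_enable(&i2c_ip);	/* i2c probe starts */
	model_disable(&i2c_ip);	/* probe done, bus idle */

	return top_ip.running;
}
```

In this model the parent only stays on when it is marked critical; setting
ignore_unused alone makes no difference, because the parent is gated by an
explicit disable from a consumer and not by the unused-clock sweep.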
>
> BTW, downstream doesn't suffer from this problem because downstream uses ACG
> throughout and clocks are enabled automatically in hardware as required.
>
Yes, using QCH clocks (HWACG) seems like the correct way to fix this,
and would be nice to have regardless. Alas, it doesn't seem very easy
to implement, and should probably be built on top of the regular clock
driver anyway. I thought about it for a while, but never came up with
any concrete ideas on how to implement HWACG support in the Samsung CCF
code properly.
>
> Cheers,
> Andre'
>