Re: [PATCH 5/5] clk: samsung: gs101: don't mark non-essential clocks as critical
From: André Draszik
Date: Mon Jan 29 2024 - 09:39:00 EST
Hi Sam,
On Fri, 2024-01-26 at 21:30 -0600, Sam Protsenko wrote:
> On Fri, Jan 26, 2024 at 6:37 PM André Draszik <andre.draszik@xxxxxxxxxx> wrote:
>
> >
> > Note that this commit has the side-effect of causing earlycon to stop
> > to work sometime into the boot for two reasons:
> > * peric0_top1_ipclk_0 requires its parent gout_cmu_peric0_ip to be
> > running, but because earlycon doesn't deal with clocks that
> > parent will be disabled when none of the other drivers that
> > actually deal with clocks correctly require it to be running and
> > the real serial driver (which does deal with clocks) hasn't taken
> > over yet
>
> That's weird. Doesn't your bootloader setup serial clocks properly?
> AFAIU, earlycon should rely on everything already configured in
> bootloader.
I tried to explain that above, but let me try again...
The console UART and I2C bus 8 are on the same cmu_peric0 controller, and
that cmu_peric0 has two clocks coming from cmu_top, ip and bus. For I2C8 & UART
to work, both of these clocks from cmu_top need to be on, as they are the
parents of i2c8-(ip|pclk) and uart-(ip|pclk).
The bootloader leaves those clocks running, yes. So earlycon works (for a
while).
At some point into the boot, one of two things happens:
1) Linux will load the i2c driver. That driver does clock handling
(correctly): it will initialise and, as it then has nothing to do, it
disables cmu_peric0's i2c8 ip and pclk clocks. Because at that stage nothing
appears to be using the cmu_peric0's ip clock (the real serial driver hasn't
initialised yet), Linux decides to also disable the parent ip clock coming
from cmu_top.
At this stage, the earlycon driver stops working, as the parent ip clock of
the uart ip clock is not running any more. No serial output can be observed
from this stage onwards. I think what is probably happening is that the
console uart FIFO doesn't get emptied anymore, and earlycon will simply wait
forever for space to become available in the FIFO (but I didn't debug this).
Anyway, the boot doesn't progress, the system appears to hang. In any case it's
not usable as we have no other means of using it at this stage (network /
usb / display etc.).
2) Alternatively, the UART driver will load at this stage. Again, it will
tweak the clocks and after probe it will leave its clocks disabled. The
serial console driver hasn't taken over at this stage and earlycon is still
active. Again, the system will hang, because IP and PCLK have been disabled
by the UART driver. Once the serial console is enabled, the clocks would be
enabled again, but because earlycon is still waiting for progress, the
boot doesn't get past the point where ip and pclk were disabled. It never gets
to enabling the serial console (and thereby re-enabling the clocks).
So in both cases we get some output from earlycon, but the system hangs once
the first consumer driver of an IP attached to cmu_peric0 has completed
probing.
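To illustrate case 1, here is a toy userspace model of enable-count
propagation. It is only a sketch and not the kernel's clk framework: the
struct, the toy_* helpers and the serial_driver_active parameter are all made
up for illustration, and it assumes the bootloader left every gate open
without anybody holding a reference.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; all names here are hypothetical, not kernel API. */
struct toy_clk {
	struct toy_clk *parent;
	int enable_count;	/* references held by consumers and children */
	int hw_on;		/* is the gate actually open in "hardware"? */
};

static void toy_clk_enable(struct toy_clk *c)
{
	if (!c)
		return;
	/* First user: make sure the parent runs, then open our gate. */
	if (c->enable_count++ == 0) {
		toy_clk_enable(c->parent);
		c->hw_on = 1;
	}
}

static void toy_clk_disable(struct toy_clk *c)
{
	if (!c)
		return;
	assert(c->enable_count > 0);	/* unbalanced disable */
	/* Last user gone: close our gate and drop the parent reference. */
	if (--c->enable_count == 0) {
		c->hw_on = 0;
		toy_clk_disable(c->parent);
	}
}

/* A clock only ticks if its own gate and all gates above it are open. */
static int toy_clk_usable(const struct toy_clk *c)
{
	for (; c; c = c->parent)
		if (!c->hw_on)
			return 0;
	return 1;
}

/*
 * The i2c driver probes: it enables and then disables its ip clock.
 * Returns whether the uart ip clock is still usable afterwards.
 */
static int toy_uart_usable_after_i2c_probe(int serial_driver_active)
{
	/* Bootloader state: gates open, but nobody holds a reference. */
	struct toy_clk top_ip = { .parent = NULL, .enable_count = 0, .hw_on = 1 };
	struct toy_clk i2c8_ip = { .parent = &top_ip, .enable_count = 0, .hw_on = 1 };
	struct toy_clk uart_ip = { .parent = &top_ip, .enable_count = 0, .hw_on = 1 };

	/* The real serial driver takes a reference; earlycon never does. */
	if (serial_driver_active)
		toy_clk_enable(&uart_ip);

	toy_clk_enable(&i2c8_ip);
	toy_clk_disable(&i2c8_ip);

	return toy_clk_usable(&uart_ip);
}
```

In this model toy_uart_usable_after_i2c_probe(0) returns 0: the uart gate
itself is still open, but the shared parent was switched off when the i2c
reference went away. With a reference held on the uart clock (argument 1) the
parent stays on and the function returns 1.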
> > * hand-over between earlycon and serial driver appears to be
> > fragile and clocks get enabled and disabled a few times, which
> > also causes register access to hang while earlycon is still
> > active
> > Nonetheless we shouldn't keep these clocks running unconditionally just
> > for earlycon. Clocks should be disabled where possible. If earlycon is
> > required in the future, e.g. for debug, this commit can simply be
> > reverted (locally!).
>
> That sounds... not ideal. The ability to enable earlycon just by
> adding some string to bootargs can be very useful for developers.
> Maybe just make those clocks CLK_IGNORE_UNUSED, if that keeps earlycon
> functional? With corresponding comments of course.
CLK_IGNORE_UNUSED doesn't help in this case: the i2c and uart drivers will load
and probe before earlycon gets disabled, and as part of their probing they
disable the cmu_top ip clock going to cmu_peric0.
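As a rough sketch of the difference between the two flags, here is a toy
userspace model (again not the kernel's clk framework; the TOY_* flags and
toy_* helpers are hypothetical stand-ins). It assumes that the ignore-unused
flag only exempts a clock from the late sweep of unused clocks, while the
critical flag takes a reference at registration that is never dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; all names here are hypothetical, not kernel API. */
#define TOY_IGNORE_UNUSED	(1u << 0)	/* stands in for CLK_IGNORE_UNUSED */
#define TOY_IS_CRITICAL		(1u << 1)	/* stands in for CLK_IS_CRITICAL */

struct toy_clk {
	struct toy_clk *parent;
	unsigned int flags;
	int enable_count;
	int hw_on;
};

static void toy_clk_enable(struct toy_clk *c)
{
	if (!c)
		return;
	if (c->enable_count++ == 0) {
		toy_clk_enable(c->parent);
		c->hw_on = 1;
	}
}

static void toy_clk_disable(struct toy_clk *c)
{
	if (!c)
		return;
	assert(c->enable_count > 0);	/* unbalanced disable */
	if (--c->enable_count == 0) {
		c->hw_on = 0;
		toy_clk_disable(c->parent);
	}
}

/* Registration: a critical clock takes a reference that is never dropped. */
static void toy_clk_register(struct toy_clk *c)
{
	c->enable_count = 0;
	c->hw_on = 1;	/* bootloader left it running */
	if (c->flags & TOY_IS_CRITICAL)
		toy_clk_enable(c);
}

/* Late sweep: gate clocks nobody enabled, unless flagged to be skipped. */
static void toy_disable_unused(struct toy_clk *c)
{
	if (c->enable_count == 0 && !(c->flags & TOY_IGNORE_UNUSED))
		c->hw_on = 0;
}

/* No consumer ever touches the clock; only the late sweep runs. */
static int toy_parent_on_after_sweep(unsigned int parent_flags)
{
	struct toy_clk parent = { .parent = NULL, .flags = parent_flags };

	toy_clk_register(&parent);
	toy_disable_unused(&parent);
	return parent.hw_on;
}

/* A consumer probes: it enables and then disables a child clock. */
static int toy_parent_on_after_probe(unsigned int parent_flags)
{
	struct toy_clk parent = { .parent = NULL, .flags = parent_flags };
	struct toy_clk child = { .parent = &parent, .flags = 0 };

	toy_clk_register(&parent);
	toy_clk_register(&child);

	toy_clk_enable(&child);
	toy_clk_disable(&child);
	return parent.hw_on;
}
```

In this model the ignore-unused flag keeps the parent on across the sweep
(toy_parent_on_after_sweep(TOY_IGNORE_UNUSED) returns 1), but not across an
explicit enable/disable pair from a probing driver
(toy_parent_on_after_probe(TOY_IGNORE_UNUSED) returns 0). Only the critical
flag survives both.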
If earlycon is not enabled on the kernel command line, everything works fine:
the kernel buffers its messages and once the real serial console driver starts,
all messages since boot are printed.
Other than keeping them as CLK_IS_CRITICAL, there is no way that I can see to
make the hand-over from earlycon to the real serial driver work in
all cases.
They are not critical clocks for the system, though, so it's wrong to always
keep them running unconditionally.
We are past a stage where earlycon is generally required.
If it's required for some local development, people can revert this patch locally.
BTW, downstream doesn't suffer from this problem because downstream uses ACG
throughout and clocks are enabled automatically in hardware as required.
Cheers,
Andre'