Re: [PATCH v2] clk: samsung: Prevent potential endless loop in the PLL set_rate ops
From: Tomasz Figa
Date: Tue Aug 11 2020 - 12:53:59 EST
On Tue, Aug 11, 2020 at 18:45, Sylwester Nawrocki <s.nawrocki@xxxxxxxxxxx> wrote:
> Hi Tomasz,
> On 11.08.2020 14:59, Tomasz Figa wrote:
> > On Tue, Aug 11, 2020 at 13:25, Sylwester Nawrocki <s.nawrocki@xxxxxxxxxxx> wrote:
> >> In the .set_rate callback for some PLLs there is a loop polling the state
> >> of the PLL lock bit, and it may become an endless loop when something
> >> goes wrong with the PLL. For some PLLs there is already (duplicated)
> >> code for polling with timeout. This patch replaces that code with
> >> the readl_relaxed_poll_timeout_atomic() macro and moves it to a common
> >> helper function, which is then used for all the PLLs. The downside
> >> of switching to the common macro is that we drop the cpu_relax() call.
> > To be honest, I'm not sure exactly what effect cpu_relax() was expected
> > to have in the functions that already had timeout handling. Could
> > someone shed some light on this?
> >> Using a common helper function rather than the macro directly lets us
> >> avoid repeating the error message in the code and avoids the object
> >> code size increase due to inlining.
> >> Signed-off-by: Sylwester Nawrocki <s.nawrocki@xxxxxxxxxxx>
> >> ---
> >> Changes for v2:
> >> - use common readl_relaxed_poll_timeout_atomic() macro
> >> ---
> >> drivers/clk/samsung/clk-pll.c | 92 +++++++++++++++----------------------------
> >> 1 file changed, 32 insertions(+), 60 deletions(-)
> >> diff --git a/drivers/clk/samsung/clk-pll.c b/drivers/clk/samsung/clk-pll.c
> >> index ac70ad7..c3c1efe 100644
> >> --- a/drivers/clk/samsung/clk-pll.c
> >> +++ b/drivers/clk/samsung/clk-pll.c
> >> @@ -9,13 +9,14 @@
> >> -#define PLL_TIMEOUT_MS 10
> >> +#define PLL_TIMEOUT_US 10000U
> > I'm also wondering whether 10 ms is a universal value that would cover
> > the oldest PLLs as well, but my loose recollection is that they should
> > still lock much faster than that. Could you double check that in the
> > documentation?
> Thanks for your comments.
> The oldest PLLs have a hard-coded 300 us wait time for PLL lock and
> are not affected by the patch.
> I have checked some of the PLLs; the maximum observed lock time was around
> 370 us, and most of the time it was just a few us.
> We calculate the lock time in each set_rate op, in oscillator cycle
> units, as the product of the current P divider value and a constant, PLL
> type specific LOCK_FACTOR. The maximum possible P value is 64 and the
> maximum possible LOCK_FACTOR is 3000. Assuming a minimum VCO frequency of
> 24 MHz (I think it will usually be much higher than that), the maximum
> lock time would be (64 x 3000) / 24 MHz = 8 ms. I think we can leave the
> current 10 ms value.
Sounds good to me. Thanks!
> But there is another issue: it seems we can't really use the ktime API
> in the set_rate callbacks, as these could be called early, before the
> clocksource is initialized, when ktime doesn't work yet. The trace below
> is from a dump_stack() added to the samsung_pll_lock_wait() callback.
> The PLL rate setting is triggered by assigned-clock* properties in
> the clock supplier node.
> I think we need to switch to a simple udelay() loop, as is done in
> the clk-tegra210 driver for instance.
> [ 0.000000] Hardware name: Samsung Exynos (Flattened Device Tree)
> [ 0.000000] [<c0111e9c>] (unwind_backtrace) from [<c010d0ec>] (show_stack+0x10/0x14)
> [ 0.000000] [<c010d0ec>] (show_stack) from [<c051d890>] (dump_stack+0xac/0xd8)
> [ 0.000000] [<c051d890>] (dump_stack) from [<c0578d94>] (samsung_pll_lock_wait+0x14/0x174)
> [ 0.000000] [<c0578d94>] (samsung_pll_lock_wait) from [<c057319c>] (clk_change_rate+0x1a8/0x8ac)
> [ 0.000000] [<c057319c>] (clk_change_rate) from [<c0573aec>] (clk_core_set_rate_nolock+0x24c/0x268)
> [ 0.000000] [<c0573aec>] (clk_core_set_rate_nolock) from [<c0573b38>] (clk_set_rate+0x30/0x64)
> [ 0.000000] [<c0573b38>] (clk_set_rate) from [<c0577df8>] (of_clk_set_defaults+0x214/0x384)
> [ 0.000000] [<c0577df8>] (of_clk_set_defaults) from [<c0572f34>] (of_clk_add_hw_provider+0x98/0xd8)
> [ 0.000000] [<c0572f34>] (of_clk_add_hw_provider) from [<c1120278>] (samsung_clk_of_add_provider+0x1c/0x30)
> [ 0.000000] [<c1120278>] (samsung_clk_of_add_provider) from [<c1121844>] (exynos5250_clk_of_clk_init_driver+0x1f4/0x240)
> [ 0.000000] [<c1121844>] (exynos5250_clk_of_clk_init_driver) from [<c11200d0>] (of_clk_init+0x16c/0x218)
> [ 0.000000] [<c11200d0>] (of_clk_init) from [<c1104bdc>] (time_init+0x24/0x30)
> [ 0.000000] [<c1104bdc>] (time_init) from [<c1100d20>] (start_kernel+0x3b0/0x520)
Yeah... I should've thought about this. Interestingly enough, some of
the existing implementations in drivers/clk/samsung/clk-pll.c use the
ktime API. I guess they are lucky enough not to be called too early,
i.e. are not needed for the initialization of timers.
> [ 0.000000] [<c1100d20>] (start_kernel) from [<00000000>] (0x0)
> [ 0.000000] samsung_pll_lock_wait: PLL fout_epll, lock time: 0 us, ret: 0
> [ 0.000000] Exynos5250: clock setup completed, armclk=1700000000
> [ 0.000000] Switching to timer-based delay loop, resolution 41ns
> [ 0.000000] clocksource: mct-frc: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
> [ 0.000003] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
> [ 0.000032] genirq: irq_chip COMBINER did not update eff. affinity mask of irq 49
> [ 0.000523] arch_timer: cp15 timer(s) running at 24.00MHz (virt).
> [ 0.000536] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x588fe9dc0, max_idle_ns: 440795202592 ns
> [ 0.000551] sched_clock: 56 bits at 24MHz, resolution 41ns, wraps every 4398046511097ns