Re: [PATCH v3 2/3] clk: divider: Switch from .round_rate to .determine_rate by default
From: Stephen Boyd
Date: Thu Jul 01 2021 - 20:53:39 EST
Quoting Guenter Roeck (2021-07-01 14:43:29)
> On 7/1/21 1:57 PM, Martin Blumenstingl wrote:
> > Hi Guenter,
> >
> > On Thu, Jul 1, 2021 at 10:25 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> > [...]
> >> [ 0.000000] [<c07be330>] (clk_core_determine_round_nolock) from [<c07c5480>] (clk_core_set_rate_nolock+0x184/0x294)
> >> [ 0.000000] [<c07c5480>] (clk_core_set_rate_nolock) from [<c07c55c0>] (clk_set_rate+0x30/0x64)
> >> [ 0.000000] [<c07c55c0>] (clk_set_rate) from [<c163c310>] (imx6ul_clocks_init+0x2798/0x2a44)
> >> [ 0.000000] [<c163c310>] (imx6ul_clocks_init) from [<c162a4e4>] (of_clk_init+0x180/0x26c)
> >> [ 0.000000] [<c162a4e4>] (of_clk_init) from [<c1604d34>] (time_init+0x20/0x30)
> >> [ 0.000000] [<c1604d34>] (time_init) from [<c1600e0c>] (start_kernel+0x4c8/0x6cc)
> >> [ 0.000000] [<c1600e0c>] (start_kernel) from [<00000000>] (0x0)
> >> [ 0.000000] Code: bad PC value
> >> [ 0.000000] ---[ end trace 7009a0f298fd39e9 ]---
> >> [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
> >>
> >> Bisct points to this patch as culprit. Reverting it fixes the problem.
> > sorry for breaking imx6 - and at the same time: thanks for reporting this!
> >
> > Do you have some additional information about this crash (which clock
> > this relates to, file and line number, etc.)?
> > I am struggling to understand the cause of this NULL dereference
> > My patch doesn't change the clk_core_determine_round_nolock()
> > implementation and the new determine_rate code-path (inside that
> > function) doesn't seem to be more fragile in terms of NULL values
> > compared to the round_rate code-path.
> > Instead I think it's more likely that the problem is somewhere within
> > clk_divider_determine_rate() (or in any helper function it uses), but
> > that doesn't show up in the trace
> >
> > I don't have any imx6 board myself and so far I am unable to reproduce
> > this crash on any hardware I have.
> > However, if it's a problem in my clk-divider.c changes then I'd like
> > to find the cause (ASAP) because possibly more SoCs may be broken...
> >
>
> I don't have such a board either. The problem shows up in my qemu boot tests. See
> https://kerneltests.org/builders/qemu-arm-v7-next/builds/38/steps/qemubuildcommand/logs/stdio
> for an example. The problem reproduces with qemu's mcimx6ul-evk and sabrelite
> emulations.
>
Would be great if we had some kunit tests for the divider code. No doubt
it would have caught this. I think for now I'll just drop this patch
from clk-next. Thanks for testing.