Re: next/master bisection: baseline.login on sun8i-h2-plus-orangepi-zero

From: Enric Balletbo Serra
Date: Wed Mar 04 2020 - 04:28:04 EST


Hi all,

Enric Balletbo Serra <eballetbo@xxxxxxxxx> wrote on Wed, 26 Feb 2020
at 18:15:
>
> Hi all,
>
> Maxime Ripard <maxime@xxxxxxxxxx> wrote on Tue, 25 Feb 2020
> at 15:34:
> >
> > On Wed, Feb 19, 2020 at 03:49:47PM -0800, Stephen Boyd wrote:
> > > Adding some Allwinner folks. Presumably there is some sort of clk that
> > > is failing to calculate a phase when it gets registered. Maybe that's
> > > because the parent isn't registered yet?
> >
> > It's simpler than that :)
> >
> > > Quoting Guillaume Tucker (2020-02-17 23:45:41)
> > > > Hi Stephen,
> > > >
> > > > Please see the bisection report below about a boot failure.
> > > >
> > > > Reports aren't automatically sent to the public while we're
> > > > trialing new bisection features on kernelci.org but this one
> > > > looks valid.
> > > >
> > > > There's nothing in the serial console log, probably because it's
> > > > crashing too early during boot. I'm not sure if other platforms
> > > > on kernelci.org were hit by this in the same way, it's tricky to
> > > > tell partly because there is no output. It should be possible to
> > > > run it again with earlyprintk enabled in BayLibre's test lab
> > > > though.
> > > >
> > > > Thanks,
> > > > Guillaume
> > > >
> > > >
> > > > On 18/02/2020 02:14, kernelci.org bot wrote:
> > > > > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> > > > > * This automated bisection report was sent to you on the basis *
> > > > > * that you may be involved with the breaking commit it has *
> > > > > * found. No manual investigation has been done to verify it, *
> > > > > * and the root cause of the problem may be somewhere else. *
> > > > > * *
> > > > > * If you do send a fix, please include this trailer: *
> > > > > * Reported-by: "kernelci.org bot" <bot@xxxxxxxxxxxx> *
> > > > > * *
> > > > > * Hope this helps! *
> > > > > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> > > > >
> > > > > next/master bisection: baseline.login on sun8i-h2-plus-orangepi-zero
> > > > >
> > > > > Summary:
> > > > > Start: c25a951c50dc Add linux-next specific files for 20200217
> > > > > Plain log: https://storage.kernelci.org//next/master/next-20200217/arm/multi_v7_defconfig/gcc-8/lab-baylibre/baseline-sun8i-h2-plus-orangepi-zero.txt
> > > > > HTML log: https://storage.kernelci.org//next/master/next-20200217/arm/multi_v7_defconfig/gcc-8/lab-baylibre/baseline-sun8i-h2-plus-orangepi-zero.html
> > > > > Result: 2760878662a2 clk: Bail out when calculating phase fails during clk registration
> > > > >
> > > > > Checks:
> > > > > revert: PASS
> > > > > verify: PASS
> > > > >
> > > > > Parameters:
> > > > > Tree: next
> > > > > URL: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> > > > > Branch: master
> > > > > Target: sun8i-h2-plus-orangepi-zero
> > > > > CPU arch: arm
> > > > > Lab: lab-baylibre
> > > > > Compiler: gcc-8
> > > > > Config: multi_v7_defconfig
> > > > > Test case: baseline.login
> > > > >
> > > > > Breaking commit found:
> > > > >
> > > > > -------------------------------------------------------------------------------
> > > > > commit 2760878662a290ac57cff8a5a8d8bda8f4dddc37
> > > > > Author: Stephen Boyd <sboyd@xxxxxxxxxx>
> > > > > Date: Wed Feb 5 15:28:02 2020 -0800
> > > > >
> > > > > clk: Bail out when calculating phase fails during clk registration
> > > > >
> > > > > Bail out of clk registration if we fail to get the phase for a clk that
> > > > > has a clk_ops::get_phase() callback. Print a warning too so that driver
> > > > > authors can easily figure out that some clk is unable to read back phase
> > > > > information at boot.
> > > > >
> > > > > Cc: Douglas Anderson <dianders@xxxxxxxxxxxx>
> > > > > Cc: Heiko Stuebner <heiko@xxxxxxxxx>
> > > > > Suggested-by: Jerome Brunet <jbrunet@xxxxxxxxxxxx>
> > > > > Signed-off-by: Stephen Boyd <sboyd@xxxxxxxxxx>
> > > > > Link: https://lkml.kernel.org/r/20200205232802.29184-5-sboyd@xxxxxxxxxx
> > > > > Acked-by: Jerome Brunet <jbrunet@xxxxxxxxxxxx>
> > > > >
> > > > > diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> > > > > index dc8bdfbd6a0c..ed1797857bae 100644
> > > > > --- a/drivers/clk/clk.c
> > > > > +++ b/drivers/clk/clk.c
> > > > > @@ -3457,7 +3457,12 @@ static int __clk_core_init(struct clk_core *core)
> > > > > * Since a phase is by definition relative to its parent, just
> > > > > * query the current clock phase, or just assume it's in phase.
> > > > > */
> > > > > - clk_core_get_phase(core);
> > > > > + ret = clk_core_get_phase(core);
> > > > > + if (ret < 0) {
> > > > > + pr_warn("%s: Failed to get phase for clk '%s'\n", __func__,
> > > > > + core->name);
> > > > > + goto out;
> > > > > + }
> >
> > The thing is, clk_core_get_phase actually returns the phase on success :)
> >
> > So, when you actually have a phase returned, and not an error, you end
> > up with a positive, non-zero, value for ret.
> >
> > And since that's the last assignment to ret, and we return ret
> > unconditionally, even on success, we end up returning that positive,
> > non-zero value to __clk_register, which treats any non-zero return
> > as a failure and then proceeds to garbage collect everything.
> >
> > I guess we're just the odd ones actually returning non-zero phases at
> > init time and in kernelci.
> >
>
> Just to note that Allwinner is not the only platform affected; Rockchip
> is hit by this issue as well. Reverting the patch fixes the issue for
> me, but the patch proposed by Maxime [1] does _NOT_ fix it on
> Rockchip, so there is something else going on. I'll take a look. I can't
> reply to that patch because it didn't reach my inbox.
>
> Regards,
> Enric
>
> [1] https://patchwork.kernel.org/patch/11403837/
>

For the record, the patch that fixes the issue on Rockchip is this:

* https://lkml.org/lkml/2020/3/3/1353

Thanks,
Enric

>
> > I'll send a patch
> >
> > Maxime
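
For anyone reading the thread later, the failure mode Maxime describes
can be reproduced in user space with a minimal sketch. Everything named
mock_* below is hypothetical and only mimics the shape of the code in
the quoted diff; it is not the kernel implementation, and
mock_init_fixed() is just one possible way to avoid the problem, not
necessarily what the patches linked above do.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for clk_core_get_phase(): returns the phase in
 * degrees (0..359) on success, or a negative errno on failure.
 */
static int mock_get_phase(int hw_phase)
{
	if (hw_phase < 0 || hw_phase >= 360)
		return -EINVAL;
	return hw_phase;
}

/*
 * Same shape as the quoted diff: ret is only checked for < 0, so on
 * success it still holds the phase when the function returns.
 */
static int mock_init_buggy(int hw_phase)
{
	int ret;

	ret = mock_get_phase(hw_phase);
	if (ret < 0)
		goto out;
	/* ret may be e.g. 90 here, which is not an error */
out:
	return ret;
}

/*
 * One possible correction (an assumption for illustration): keep the
 * phase in its own variable and return 0 on success.
 */
static int mock_init_fixed(int hw_phase)
{
	int phase = mock_get_phase(hw_phase);

	if (phase < 0)
		return phase;
	return 0;
}

/*
 * Hypothetical caller that, as described for __clk_register, treats
 * any non-zero return from init as a failure and would unwind.
 */
static int mock_register(int (*init)(int), int hw_phase)
{
	if (init(hw_phase))
		return -1;	/* registration abandoned */
	return 0;		/* clock registered */
}
```

With a phase of 0 the buggy variant happens to work, which may be why
many platforms never noticed: mock_register(mock_init_buggy, 0)
succeeds, while mock_register(mock_init_buggy, 90) fails even though
reading the phase succeeded. The fixed variant registers in both cases
and still fails when the phase cannot be read.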