Re: Boot failure on gru-scarlet-inx with 5.9-rc2

From: Samuel Dionne-Riel
Date: Sun Aug 30 2020 - 16:19:39 EST


On Sun, 30 Aug 2020 10:41:42 +0100
Marc Zyngier <maz@xxxxxxxxxx> wrote:

Hi,

Thanks for the reply.

> Hi Samuel,
>
> On 2020-08-29 21:54, Samuel Dionne-Riel wrote:
> > Hi,
> >
> > The patch "of: address: Work around missing device_type property in
> > pcie nodes" by Marc Zyngier,
> > d1ac0002dd297069bb8448c2764c9c31c4668441, causes the "DUMO" variant
> > of the gru-scarlet-inx, at the very least, to not boot. A gru-kevin
> > reportedly had no issues booting (further), though it most likely
> > had a different kernel configuration.
>
> Do you have a pointer to the device-tree for this system? I couldn't
> spot anything amiss in the scarlet-inx DT, but I'm not sure the
> system you have is that exact one. Even a DTB would help.

Is "arch/arm64/boot/dts/rockchip/rk3399-gru-scarlet-inx.dts" what you
wanted? The FDT in use is the one present in the kernel tree at the
same revision, i.e. the one whose `compatible` property lists a bunch
of `google,scarlet-rev*-sku6` entries. The build process for the kernel
partition (booting using depthcharge) ensures they are always in sync.

In any case, from previous discussions with people involved with the
scarlet development, the only difference between the scarlets is the
display (innolux vs. kingdisplay). I would expect (and hope) the issue
to be the same on all of them.

> The fact that Kevin still boots is a good indication that the issue
> could be with the board-specific changes layered on top of the
> GRU base. My own rk3399 systems are running with this patch.
>
> > Using a SuzyQ cable, there is absolutely no serial output at boot,
> > while reverting the commit (and this commit alone) on top of
> > v5.9-rc2 works just as it did with v5.9-rc1.
>
> Do you have "earlycon" on the kernel command-line?

I did not; I thought earlyprintk was earlycon... I had:

console=ttyS2,115200n8 earlyprintk=ttyS2,115200n8

Now with either "earlycon" or "earlycon=uart8250,mmio32,0xff1a0000" I
somehow can't get any output, and it's not booting. That is with and
without the problematic patch, and also verified on v5.8. Odd.

So I would say that I don't have earlycon, and maybe can't? I'm open to
suggestions.

> > From this point on, I don't know what's the usual process, so bear
> > with me if I forgot to provide relevant information, or made a
> > faux-pas by CC-ing too many people or not enough.
>
> No need to worry, and thank you for reporting the issue.
>
> Could you try replacing the problematic patch with [1], and let me
> know whether this changes anything on your end? This patch probably
> isn't the right approach, but it would certainly help pointing me
> in the right direction.
>
> [1]
> https://lore.kernel.org/lkml/20200815125112.462652-2-maz@xxxxxxxxxx/

On top of v5.9-rc2 + revert d1ac0002dd297069bb8448c2764c9c31c4668441

$ curl https://lore.kernel.org/lkml/[...]/raw | git am

With the patch, both with and without the (probably bad) earlycon, I
get the same result: hang at boot, no serial output.
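
Spelled out, the sequence above amounts to roughly the following. This
is only a sketch: the `run` helper and the DRY_RUN variable are made up
for illustration, and the patch URL stays elided exactly as above, so it
has to be filled in before doing a real run from inside a kernel git
checkout.

```shell
#!/bin/sh
# Sketch of the test sequence (illustrative, not an exact transcript).
BAD_COMMIT=d1ac0002dd297069bb8448c2764c9c31c4668441
# Placeholder: the message-id part of the URL is elided, as in the mail.
PATCH_URL='https://lore.kernel.org/lkml/[...]/raw'

# Hypothetical helper: with DRY_RUN=1 (the default) commands are only
# printed; with DRY_RUN=0 they are executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Start from the tag that fails to boot, then revert the suspect commit.
run git checkout v5.9-rc2
run git revert --no-edit "$BAD_COMMIT"

# Apply the replacement patch; this is a pipeline, so handle it separately.
if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "+ curl -fsSL $PATCH_URL | git am"
else
    curl -fsSL "$PATCH_URL" | git am
fi
```

By default this only prints the commands, which makes it easy to check
the sequence before running it with DRY_RUN=0.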

Again, knowing that the hardware is not necessarily in the hands of
everyone, I'll be glad to try patches and configurations proposed to
further the understanding of the issues.