Re: beaglebone black boot failure Linux v5.15.rc1

From: Tony Lindgren
Date: Fri Sep 17 2021 - 06:47:23 EST


Hi,

* Vaittinen, Matti <Matti.Vaittinen@xxxxxxxxxxxxxxxxx> [210917 10:29]:
> Oh.. Sorry! I don't know where I picked Tom from... My bad!

No worries :)

> > For me, adding any kind of delay fixed the issue. Also adding some printk
> > statements fixed it for me.
> >
> >> Any suggestions what to check next?
> >
> > Maybe try the attached patch? If it helps, just try with the
> > ti,sysc-delay-us = <2> added as a few modules need that after enable.
> >
> > It's also possible there is an issue with some other device that is now
> > getting enabled other than pruss. The last XXX printk output should show
> > the last device being probed.
> >
> > Looks like you need to also enable CONFIG_SERIAL_EARLYCON=y, and pass
> > console=ttyS0,115200 debug earlycon in the kernel command line.
>
> Ah. Thanks again. I indeed lacked the "debug earlycon" parameters. Now
> we're more likely to see what went wrong :) I pasted the serial log from
> failing boot with v5.15-rc1 which has both the patch you linked me above
> and the patch you suggested me to test in previous email.

OK thanks.

> [ 2.830347] ti-sysc 4a326000.target-module: XXX sysc_probe
> [ 2.836198] 8<--- cut here ---
> [ 2.839339] Unhandled fault: external abort on non-linefetch (0x1008) at 0xe0266000

Yup, this is the pruss target-module@300000 that has the first reg at
4a326000. The oops looks very similar to what I was seeing with my bbb.
The external abort means the pruss module is not properly enabled when
accessing the registers.

Not sure what might be different here, presumably all am335x hardware has
the pruss. Maybe try with a larger ti,sysc-delay-us value? I doubt that
helps though as 2 is the most we've seen so far for the delay needed.
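
For reference, a minimal sketch of what such a test override could look
like in the board dts. This assumes the pruss target-module node carries
the pruss_tm label in am33xx-l4.dtsi, and the value 10 is an arbitrary
test value rather than a known-good setting:

```dts
/* Test only: assumes the pruss_tm label exists for target-module@300000 */
&pruss_tm {
	/* Arbitrary larger delay for testing; 2 is the largest needed so far */
	ti,sysc-delay-us = <10>;
};
```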

Maybe the issue is in omap_reset_deassert(). You could try adding some
printk to omap_reset_deassert() and see if the issue happens right away
or after deasserting the reset. If it's after deasserting the reset, you
could try adding a delay to the end of omap_reset_deassert() and see if that
helps.

Regards,

Tony