Re: beaglebone black boot failure Linux v5.15.rc1
From: Vaittinen, Matti
Date: Fri Sep 17 2021 - 07:35:15 EST
Thanks a lot guys!
On 9/17/21 14:01, Grygorii Strashko wrote:
>
>
> On 17/09/2021 13:57, Grygorii Strashko wrote:
>>
>>
>> On 17/09/2021 13:28, Vaittinen, Matti wrote:
>>> Hi deeee Ho Tony & All,
>>>
>>> On 9/17/21 09:14, Tony Lindgren wrote:
>>>> Hi,
>>>>
>>>> * Vaittinen, Matti <Matti.Vaittinen@xxxxxxxxxxxxxxxxx> [210916 09:15]:
>>>
>>>>> My beaglebone black (rev c) based test environment fails to boot with
>>>>> v5.15-rc1. Boot succeeds with the v5.14.
>>>>>
>>>>> Bisecting the Linus' tree pointed out the commit:
>>>>> [1c7ba565e70365763ea780666a3eee679344b962] ARM: dts: am335x-baltos:
>>>>> switch to new cpsw switch drv
>>>>>
>>>>> I don't see this exact commit touching the BBB device-tree. In Linus'
>>>>> tree it is a part of a merge commit. Reverting the whole merge on
>>>>> top of
>>>>> the v5.15-rc1
>>>>>
>>>>> This reverts commit 81b6a285737700c2e04ef0893617b80481b6b4b7,
>>>>> reversing
>>>>> changes made to f73979109bc11a0ed26b6deeb403fb5d05676ffc.
>>>>>
>>>>> makes my beaglebone black to boot again.
>>>>>
>>>>> Yesterday I tried adding this patch:
>>>>> https://lore.kernel.org/linux-omap/20210915065032.45013-1-tony@xxxxxxxxxxx/T/#u
>>>>>
>>>>> pointed by Tom on top of the v5.15-rc1 - no avail. I also did #define
>>>>> DEBUG at ti-sys.c as was suggested by Tom - but I don't see any
>>>>> more output.
>>>>
>>>> Correction, that was me, not Tom :)
>>>
>>> Oh.. Sorry! I don't know where I picked Tom from... My bad!
>>>
>>>> For me, adding any kind of delay fixed the issue. Also adding some
>>>> printk
>>>> statements fixed it for me.
>>>>
>>>>> Any suggestions what to check next?
>>>>
>>>> Maybe try the attached patch? If it helps, just try with the
>>>> ti,sysc-delay-us = <2> added as few modules need that after enable.
>>>>
>>>> It's also possible there is an issue with some other device that is now
>>>> getting enabled other than pruss. The last XXX printk output should
>>>> show
>>>> the last device being probed.
>>>>
>>>> Looks like you need to also enable CONFIG_SERIAL_EARLYCON=y, and pass
>>>> console=ttyS0,115200 debug earlycon in the kernel command line.
>>>
>>> Ah. Thanks again. I indeed lacked the "debug earlycon" parameters. Now
>>> we're more likely to see what went wrong :) I pasted the serial log from
>>> failing boot with v5.15-rc1 which has both the patch you linked me above
>>> and the patch you suggested me to test in previous email.
>>>
>>
This really feels like a timing/synchronization issue. Adding prints to
omap_reset_deassert() made the Oops go away. I suspect the prints
changed the timing just the needed bit. Later the boot hung on a failing
NFS mount though - but that may also be a problem on the NFS server
side. (I have a new laptop and I am still trying to set up my
development environment there.)
>> [...]
>>
>>> [ 2.786181] ti-sysc 48311fe0.target-module: XXX sysc_probe
>>> [ 2.791994] ti-sysc 48311fe0.target-module:
>>> 48310000:2000:1fe0:1fe4:NA:00000020:rng
>>> [ 2.800820] omap_rng 48310000.rng: Random Number Generator ver. 20
>>> [ 2.807315] random: crng init done
>>> [ 2.814207] ti-sysc 4a101200.target-module: XXX sysc_probe
>>> [ 2.820080] ti-sysc 4a101200.target-module:
>>> 4a100000:8000:1200:1208:1204:4edb0100:cpgmac
>>
>> This one cpsw
>>
>>> [ 2.830347] ti-sysc 4a326000.target-module: XXX sysc_probe
>>
>> This one pruss and it still shows sysc_probe
>>
>> Not sure what are the dependency here :( if any.
>>
>> Additional option to try - cmdline param "initcall_debug" and maybe
>> increase print level in really_probe_debug()
>>
>
> Just to be clear - idea is to see *all* probes - not only sysc.
>
> [...]
>
I added initcall_debug && changed the pr_debug() to pr_err() in
really_probe_debug(). Log from that run is attached. The
omap_reset_deassert() was not instrumented to print/delay for this run.
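
For anyone who wants to skim such a log quickly: below is a minimal
sketch (not something I ran for this report) that lists the initcalls
which printed "calling" but never printed a matching "initcall ...
returned" line. It assumes the usual initcall_debug line formats; the
function name unfinished_initcalls() and the sample lines in the comment
are made up for illustration.

```python
import re

# Assumed initcall_debug line formats (illustrative sample, not from the
# attached log):
#   [    2.100000] calling  foo_init+0x0/0x20 @ 1
#   [    2.100100] initcall foo_init+0x0/0x20 returned 0 after 97 usecs
CALL_RE = re.compile(r"calling\s+(\S+)\s+@\s+\d+")
DONE_RE = re.compile(r"initcall\s+(\S+)\s+returned\s+(-?\d+)")


def unfinished_initcalls(lines):
    """Return initcalls that started but never reported a return.

    The result keeps the order in which the calls were started, so the
    last entry is the most recent initcall still running when the log
    ends.
    """
    pending = []
    for line in lines:
        done = DONE_RE.search(line)
        if done:
            # Matching "returned" line seen: this initcall completed.
            if done.group(1) in pending:
                pending.remove(done.group(1))
            continue
        call = CALL_RE.search(line)
        if call:
            pending.append(call.group(1))
    return pending


# Possible usage with a serial capture file (errors="replace" because
# terminal captures may contain stray non-UTF-8 bytes):
#
#   with open("bbb_boot_pruss_minicom_2.cap", errors="replace") as f:
#       print(unfinished_initcalls(f))
```

This only pairs up the initcall lines. The really_probe_debug() print
reports a probe only after it has returned, so a probe that hangs shows
up as a missing line rather than as an unfinished one.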
Best Regards
Matti Vaittinen
Attachment: bbb_boot_pruss_minicom_2.cap