Re: [PATCH v4 1/2] of: create of_root if no dtb provided

From: Rob Herring
Date: Wed Mar 27 2024 - 14:38:36 EST


On Wed, Mar 27, 2024 at 9:40 AM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On 3/27/24 06:11, Rob Herring wrote:
> > On Wed, Mar 20, 2024 at 3:06 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >>
> >> On 3/20/24 12:14, Rob Herring wrote:
> >>> On Mon, Mar 18, 2024 at 4:31 PM Guenter Roeck <linux@roeck-usnet> wrote:
> >>>>
> >>>> On 3/18/24 12:26, Rob Herring wrote:
> >>>>> +Stephen
> >>>>>
> >>>>> On Mon, Mar 18, 2024 at 12:09 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> On Fri, Mar 17, 2023 at 12:34:14AM -0500, Frank Rowand wrote:
> >>>>>>> When enabling CONFIG_OF on a platform where of_root is not populated by
> >>>>>>> firmware, we end up without a root node. In order to apply overlays and
> >>>>>>> create subnodes of the root node, we need one. Create this root node
> >>>>>>> by unflattening an empty builtin dtb.
> >>>>>>>
> >>>>>>> If firmware provides a flattened device tree (FDT) then the FDT is
> >>>>>>> unflattened via setup_arch(). Otherwise setup_of(), which is called
> >>>>>>> immediately after setup_arch(), will create the default root node
> >>>>>>> if it does not exist.
> >>>>>>>
> >>>>>>> Signed-off-by: Frank Rowand <frowand.list@xxxxxxxxx>
> >>>>>>
> >>>>>> This patch results in a crash on nios2.
> >>>>>
> >>>>> This patch was never applied. I assume you meant a later version of it
> >>>>> that did get applied.
> >>>>>
> >>>>>>
> >>>>>> Building nios2:10m50-ghrd:10m50_defconfig:10m50_devboard.dts ... running ...R failed (crashed)
> >>>>>
> >>>>> Booting with DT?
> >>>>>
> >>>>>> ------------
> >>>>>> qemu log:
> >>>>>> earlycon: uart8250 at MMIO32 0x18001600 (options '')
> >>>>>> printk: legacy bootconsole [uart8250] enabled
> >>>>>> Linux version 6.8.0-11409-gf6cef5f8c37f (groeck@desktop) (nios2-linux-gcc (GCC) 11.4.0, GNU ld (GNU Binutils) 2.40) #1 Sun Mar 17 23:38:59 PDT 2024
> >>>>>> Kernel panic - not syncing: early_init_dt_alloc_memory_arch: Failed to allocate 72 bytes align=0x40
> >>>>>> ---[ end Kernel panic - not syncing: early_init_dt_alloc_memory_arch: Failed to allocate 72 bytes align=0x40 ]---
> >>>>>
> >>>>> nios2 looks utterly broken to me. This change should be a nop unless
> >>>>> initial_boot_params is NULL. It looks like it is possible for r6 (dtb
> >>>>> address) to be 0 depending on kconfig options, but that would have
> >>>>> skipped copying and unflattening which would then panic in
> >>>>> setup_cpuinfo(). If initial_boot_params is not NULL, then the same
> >>>>> early_init_dt_alloc_memory_arch() calls should fail when copying the
> >>>>> DT. So I don't see how nios2 booting with DT ever worked.
> >>>>>
> >>>>
> >>>> For nios2, in early_init_devtree():
> >>>>
> >>>> void __init early_init_devtree(void *params)
> >>>> {
> >>>> __be32 *dtb = (u32 *)__dtb_start;
> >>>> ...
> >>>> if (be32_to_cpu((__be32) *dtb) == OF_DT_HEADER)
> >>>> params = (void *)__dtb_start;
> >>>>
> >>>> That worked fine until this patch. Starting with this patch, __dtb_start
> >>>> always points to a valid empty devicetree blob, which overrides the
> >>>> devicetree blob passed to early_init_devtree(). This causes the problem.
> >>>
> >>> With an external DTB, it doesn't boot with or without this patch. It
> >>> just dies in different spots. Before it just skipped any memory
> >>
> >> No, that is incorrect.
> >
> > Well, I can tell you it doesn't boot for me. So I must be doing
> > something different from your setup.
> >
>
> Maybe you have OF_UNITTEST enabled and it indeed results in the
> problem you mention below. I don't have it enabled because it produces
> various backtraces which would hide real problems.

I thought of that, but I don't think I did. What I suspect is the
external dtb is at address 0.

> >> Up to this patch it booted just fine with an
> >> external dtb using the "-initrd" command line argument, and I explained
> >> to you above why this is the case.
> >
> > What does -initrd have to do with anything? Does that shift where the
> > external dtb is placed or something?
> >
>
> Nothing. I meant to say -dtb.
>
> > I think I see the issue. __dtb_start points to the start of *all*
> > built-in DTBs, not a specific one. In this case, arc, csky, loongarch,
> > mips, openrisc, riscv, sh, and xtensa may all be broken too (if one
> > picks the magic combination of booting modes and kconfig options). I
>
> No.
>
> - arc only picks the internal dtb if use_embedded_dtb is true. This flag
> is only set if there is no external dtb, or if the external dtb does
> not provide a valid machine description.

Right, but when it does pick the internal dtb, it is expecting its dtb
at __dtb_start. What I'm saying is that's never been a good or safe
assumption. We just happened to add another case to trigger it. The
only reliable way to get a built-in DTB is if foo.dtb is built-in,
then use __dtb_foo_begin to get its address. That's what some MIPS
platforms with multiple DTBs do.

> - openrisc only picks the internal dtb if no external dtb is provided.
> - riscv only picks the internal dtb if CONFIG_BUILTIN_DTB is enabled.
> - sh only used the internal dtb if CONFIG_USE_BUILTIN_DTB is enabled.
> - xtensa only picks the internal dtb if there is no external dtb.
>
> However, nios2 picks the internal dtb _even if_ an external dtb
> is provided if there is an internal dtb. In other words, it prefers
> the internal dtb over the external dtb. All other architectures
> prefer the external dtb over the internal dtb.

Thanks for the analysis. I had started and abandoned common support
(mostly Kconfig options) for built-in dtbs years ago. I decided
against it because it is not something we want to encourage (as the
boot dtb). In the meantime, we've gained new architectures that have
added it. Sigh. So now I'm reconsidering something common (though not
for v6.9).

>
> > would expect all these cases have been broken forever if the DT
> > unittest is enabled as it too adds a built-in dtb. But I would also
>
> Even if that is correct for nios2, that hardly seems to be an argument
> to break nios2 boot with external dtb unconditionally.

That wasn't an argument for breaking it. Using an external dtb should
really be the default and strongly preferred though.

I'm still not sure how to fix this easily for 6.9. Something like what
microblaze does which puts the boot dtb under a consistent symbol
name. Or perhaps we could iterate thru the built-in dtbs and skip ones
without top-level compatible.

Rob