Re: Kernel Panic - V6.2 - Reserved memory issue

From: Christian Hewitt
Date: Sun Apr 02 2023 - 09:12:22 EST


> On 2 Apr 2023, at 12:10 pm, Lucas Tanure <tanure@xxxxxxxxx> wrote:
>
> Hi,
>
> I am trying to fix a kernel panic I am seeing on my VIM3 board (Amlogic A311D).
> I don't have enough knowledge about this area, but my current guess is
> that the kernel is using a piece of memory belonging to ARM Trusted
> Firmware that it shouldn't.
> Log:
>
> [ 9.792966] SError Interrupt on CPU3, code 0x00000000bf000000 -- SError
> [ 9.792980] CPU: 3 PID: 3471 Comm: kded5 Tainted: G C 6.2.0 #1
> [ 9.792985] Hardware name: Khadas VIM3 (DT)
> [ 9.792987] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 9.792991] pc : kmem_cache_free_bulk.part.98+0x1f0/0x528
> [ 9.793004] lr : kmem_cache_free_bulk.part.98+0x2f8/0x528
> [ 9.793008] sp : ffff80000a2eb7f0
> [ 9.793009] x29: ffff80000a2eb7f0 x28: ffff00001f358518 x27: ffff000000008800
> [ 9.793016] x26: ffff00000262b300 x25: ffff00000262b300 x24: 0000000000000001
> [ 9.793019] x23: ffff00000262b000 x22: 0000000000000000 x21: ffff00001f358538
> [ 9.793022] x20: fffffc0000098ac0 x19: 0000000000000004 x18: 0000000000000040
> [ 9.793025] x17: 0000000000000018 x16: 00000000000007f8 x15: 0000000000000003
> [ 9.793028] x14: 0000000000000006 x13: ffff800008e48550 x12: 0000ffff9dc91fff
> [ 9.793031] x11: 0000000000000004 x10: 0000000000000001 x9 : ffff000007e93680
> [ 9.793035] x8 : 0000000000000020 x7 : ffff000001d2b100 x6 : 0000000000000007
> [ 9.793037] x5 : 0000000000000020 x4 : ffff000000008800 x3 : 0000000000000001
> [ 9.793040] x2 : 0000000000000007 x1 : 0000000000000000 x0 : ffff00001f358540
> [ 9.793045] Kernel panic - not syncing: Asynchronous SError Interrupt
>
> This doesn't happen with downstream Khadas 6.2 kernel, and that's
> because the downstream kernel removed this from
> early_init_dt_reserve_memory (drivers/of/fdt.c):
>
>         /*
>          * If the memory is already reserved (by another region), we
>          * should not allow it to be marked nomap, but don't worry
>          * if the region isn't memory as it won't be mapped.
>          */
>         if (memblock_overlaps_region(&memblock.memory, base, size) &&
>             memblock_is_region_reserved(base, size))
>                 return -EBUSY;
>
>
> With that check removed, the 3 MiB of memory belonging to ARM Trusted
> Firmware is reserved successfully.
>
> arch/arm64/boot/dts/amlogic/meson-g12-common.dtsi:
>         /* 3 MiB reserved for ARM Trusted Firmware (BL31) */
>         secmon_reserved: secmon@5000000 {
>                 reg = <0x0 0x05000000 0x0 0x300000>;
>                 no-map;
>         };
>
> And the mainline kernel fails to reserve that memory:
> [ 0.000000] OF: fdt: Reserved memory: failed to reserve memory for
> node 'secmon@5000000': base 0x0000000005000000, size 3 MiB
>
> It fails to reserve because memblock_overlaps_region and
> memblock_is_region_reserved both return true.
> I think memblock_is_region_reserved is saying the memory is already
> reserved by U-Boot and so can't be marked no-map, even though it
> needs to be.
>
> Is there a bug here?
> Why is the kernel failing to reserve this memory?
> Is this a U-Boot issue?
>
> I would appreciate any help. The current mainline kernel fails to
> boot on the VIM3 board 90% of the time.

The issue was raised before by Stefan Agner here:

https://lore.kernel.org/linux-arm-kernel/40ca11f84b7cdbfb9ad2ddd480cb204a@xxxxxxxx/

The thread sort of points at the general issue but the conversation
fizzled out and didn’t lead to any changes. At one point Stefan made
a suggestion about reverting part of the code, leading to this patch
in my own patchset:

https://github.com/chewitt/linux/commit/9633c9b24f6f16afdb7fa8c2e163b6ea7a7ac5f8

The issue is still present and the patch does work around it. The
crashes would probably be reported more often, except that a large
percentage of the distros that actively support Amlogic boards (and
several vendors) pick chunks of my curated LibreELEC patchset for
their own kernels, so that patch is quite widely used.
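
For anyone following along, here is a rough userspace sketch of the
check quoted above from early_init_dt_reserve_memory. It is only a
model: struct region, overlaps_any() and model_reserve_nomap() are
hypothetical names, not kernel APIs, and the idea that the bootloader
has already reserved the range is Lucas's guess rather than something
confirmed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct region {
	uint64_t base;
	uint64_t size;
};

/* True if [base, base + size) intersects any region in the list. */
static bool overlaps_any(const struct region *r, size_t n,
			 uint64_t base, uint64_t size)
{
	assert(size > 0);
	for (size_t i = 0; i < n; i++) {
		if (base < r[i].base + r[i].size && r[i].base < base + size)
			return true;
	}
	return false;
}

/*
 * Models the no-map path quoted above: refuse with -EBUSY when the
 * range is RAM and something has already reserved part of it. A return
 * of 0 stands in for the region being marked no-map successfully.
 */
static int model_reserve_nomap(const struct region *memory, size_t n_mem,
			       const struct region *reserved, size_t n_res,
			       uint64_t base, uint64_t size)
{
	if (overlaps_any(memory, n_mem, base, size) &&
	    overlaps_any(reserved, n_res, base, size))
		return -EBUSY;
	return 0;
}
```

In this model, asking for base 0x05000000 and size 0x300000 returns
-EBUSY whenever the range is inside the memory list and an earlier
entry in the reserved list covers any part of it. It returns 0 if
either list does not touch the range, which matches the "don't worry
if the region isn't memory" part of the comment.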

Christian