Re: Kernel Panic - V6.2 - Reseved memory issue

From: Lucas Tanure
Date: Mon Apr 03 2023 - 11:29:04 EST


On Mon, Apr 3, 2023 at 12:29 PM Lucas Tanure <tanure@xxxxxxxxx> wrote:
>
> On Sun, Apr 2, 2023 at 1:55 PM Bagas Sanjaya <bagasdotme@xxxxxxxxx> wrote:
> >
> > On Sun, Apr 02, 2023 at 09:10:36AM +0100, Lucas Tanure wrote:
> > > Log:
> > >
> > > [ 9.792966] SError Interrupt on CPU3, code 0x00000000bf000000 -- SError
> > > [ 9.792980] CPU: 3 PID: 3471 Comm: kded5 Tainted: G C 6.2.0 #1
> > > [ 9.792985] Hardware name: Khadas VIM3 (DT)
> > > [ 9.792987] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > [ 9.792991] pc : kmem_cache_free_bulk.part.98+0x1f0/0x528
> > > [ 9.793004] lr : kmem_cache_free_bulk.part.98+0x2f8/0x528
> > > [ 9.793008] sp : ffff80000a2eb7f0
> > > [ 9.793009] x29: ffff80000a2eb7f0 x28: ffff00001f358518 x27: ffff000000008800
> > > [ 9.793016] x26: ffff00000262b300 x25: ffff00000262b300 x24: 0000000000000001
> > > [ 9.793019] x23: ffff00000262b000 x22: 0000000000000000 x21: ffff00001f358538
> > > [ 9.793022] x20: fffffc0000098ac0 x19: 0000000000000004 x18: 0000000000000040
> > > [ 9.793025] x17: 0000000000000018 x16: 00000000000007f8 x15: 0000000000000003
> > > [ 9.793028] x14: 0000000000000006 x13: ffff800008e48550 x12: 0000ffff9dc91fff
> > > [ 9.793031] x11: 0000000000000004 x10: 0000000000000001 x9 : ffff000007e93680
> > > [ 9.793035] x8 : 0000000000000020 x7 : ffff000001d2b100 x6 : 0000000000000007
> > > [ 9.793037] x5 : 0000000000000020 x4 : ffff000000008800 x3 : 0000000000000001
> > > [ 9.793040] x2 : 0000000000000007 x1 : 0000000000000000 x0 : ffff00001f358540
> > > [ 9.793045] Kernel panic - not syncing: Asynchronous SError Interrupt
> > >
> > > This doesn't happen with downstream Khadas 6.2 kernel, and that's
> > > because the downstream kernel removed this from
> > > early_init_dt_reserve_memory (drivers/of/fdt.c):
> > >
> > > /*
> > > * If the memory is already reserved (by another region), we
> > > * should not allow it to be marked nomap, but don't worry
> > > * if the region isn't memory as it won't be mapped.
> > > */
> > > if (memblock_overlaps_region(&memblock.memory, base, size) &&
> > > memblock_is_region_reserved(base, size))
> > > return -EBUSY;
> > >
> >
> > What commit on downstream kernel that fix the issue?
> Here:
> https://github.com/khadas/linux/commit/2cb57b1071bf69f615fedc999b7ecacf2cde7228
>
> Can you reproduce
> > on mainline with above conditional removed?
> No, without that code mainline works fine.
>
>
> Alternatively, can
> > you post the downstream fix here?
> Same https://github.com/khadas/linux/commit/2cb57b1071bf69f615fedc999b7ecacf2cde7228
>
> >
> > Also, can you find last working commit on mainline? If so, this is
> > regression.
> That is difficult as 5.13.0 has the line:
> OF: fdt: Reserved memory: failed to reserve memory for node
> 'secmon@5000000': base 0x0000000005000000, size 3 MiB
> But doesn't crash. It could be that no process used that address so no crash.
>
> >
> > Thanks.
> >
> > --
> > An old man doll... just what I always wanted! - Clara

Hi,
git bisect pointed to a commit, but the panic still triggers even with
that commit reverted.
So this is a memory corruption problem, and a simple git bisect will
not find the actual offending commit.

I managed to understand a little more about the issue:
1) early_init_fdt_scan_reserved_mem is executed first and reserves
[0x0000000005000000-0x00000000052fffff], but doesn't mark it as nomap.
2) early_init_dt_reserve_memory then tries to mark that region as nomap,
but it is already reserved, so memblock_overlaps_region and
memblock_is_region_reserved both return true and it fails with -EBUSY.
3) The kernel maps and uses that memory, and crashes.
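
The three steps above can be modelled with a small userspace toy. This is
only a sketch: the toy_* names are hypothetical stand-ins, not the kernel
memblock API, and it assumes the whole range is RAM (so the
memblock_overlaps_region half of the check is always true).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for memblock's reserved list (hypothetical, not kernel code). */
struct toy_region {
	uint64_t base;
	uint64_t size;
	bool nomap;
};

#define TOY_MAX 8
static struct toy_region toy_reserved[TOY_MAX];
static int toy_nr_reserved;

/* True if [base, base + size) intersects any already-reserved region. */
static bool toy_is_region_reserved(uint64_t base, uint64_t size)
{
	for (int i = 0; i < toy_nr_reserved; i++) {
		const struct toy_region *r = &toy_reserved[i];

		if (base < r->base + r->size && r->base < base + size)
			return true;
	}
	return false;
}

/* Step 1: plain reservation, without the nomap flag. */
static void toy_reserve(uint64_t base, uint64_t size)
{
	assert(toy_nr_reserved < TOY_MAX);
	toy_reserved[toy_nr_reserved++] =
		(struct toy_region){ base, size, false };
}

/* Step 2: nomap request; refuses if the range is already reserved,
 * mirroring the check quoted from drivers/of/fdt.c. */
static int toy_reserve_nomap(uint64_t base, uint64_t size)
{
	if (toy_is_region_reserved(base, size))
		return -EBUSY;

	assert(toy_nr_reserved < TOY_MAX);
	toy_reserved[toy_nr_reserved++] =
		(struct toy_region){ base, size, true };
	return 0;
}

/* Step 3: an address stays mapped unless a nomap region covers it. */
static bool toy_is_mapped(uint64_t addr)
{
	for (int i = 0; i < toy_nr_reserved; i++) {
		const struct toy_region *r = &toy_reserved[i];

		if (addr >= r->base && addr - r->base < r->size && r->nomap)
			return false;
	}
	return true;
}
```

Calling toy_reserve(0x05000000, 0x300000) and then
toy_reserve_nomap(0x05000000, 0x300000) returns -EBUSY in this model, and
toy_is_mapped(0x05000000) stays true, which is the situation in step 3.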

I think we have two options here:
1) Allow early_init_dt_reserve_memory to mark memory that is already
reserved as nomap.
2) Make early_init_fdt_scan_reserved_mem reserve with the nomap flag
if necessary. I don't know if that's possible.
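
To make the difference between the two options concrete, here is a minimal
sketch on a single toy region. The opt_* names and the struct are
hypothetical and only illustrate the policies; they are not a proposed
patch or the real memblock interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One toy region with the two bits of state that matter here. */
struct opt_region {
	uint64_t base;
	uint64_t size;
	bool reserved;
	bool nomap;
};

/* Behaviour as described in this thread: a nomap request on an
 * already-reserved region is refused. */
static int opt_mark_nomap_strict(struct opt_region *r)
{
	assert(r != NULL);
	if (r->reserved)
		return -EBUSY;
	r->reserved = true;
	r->nomap = true;
	return 0;
}

/* Option 1: allow the nomap flag to be set even when the region
 * has already been reserved. */
static int opt_mark_nomap_permissive(struct opt_region *r)
{
	assert(r != NULL);
	r->reserved = true;
	r->nomap = true;
	return 0;
}

/* Option 2: the first reservation carries the flag itself, so no
 * later nomap request is needed. */
static void opt_reserve(struct opt_region *r, bool nomap)
{
	assert(r != NULL);
	r->reserved = true;
	r->nomap = nomap;
}
```

With a region first reserved through opt_reserve(&r, false), the strict
variant returns -EBUSY and leaves nomap clear, while the permissive variant
sets it. With opt_reserve(&r, true) the flag is already set after the first
reservation.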

Questions for the MM folks, Mike Rapoport and Andrew Morton:
- Is it possible to make early_init_fdt_scan_reserved_mem reserve
memory with flags?
- Is it OK for early_init_fdt_scan_reserved_mem to mark regions that are
already reserved as nomap?

Thanks
Lucas