Re: [v5.3-rc1 regression] Hitting a kernel BUG() when trying to load a module on DaVinci SoC

From: Bartosz Golaszewski
Date: Tue Jul 23 2019 - 03:45:57 EST


pon., 22 lip 2019 o 18:10 Jessica Yu <jeyu@xxxxxxxxxx> napisaÅ(a):
>
> +++ Bartosz Golaszewski [22/07/19 14:12 +0200]:
> >Hi,
> >
> >with v5.3-rc1 I'm hitting the following BUG() when trying to load the
> >gpio-backlight module:
> >
> >kernel BUG at kernel/module.c:1919!
> >Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
> >Modules linked in:
> >CPU: 0 PID: 1 Comm: systemd Tainted: G W
> >5.2.0-rc2-00005-g7dabaa5ce05a #19
> >Hardware name: DaVinci DA850/OMAP-L138/AM18x EVM
> >PC is at frob_text.constprop.16+0x2c/0x34
> >LR is at load_module+0x1888/0x21b4
> >pc : [<c0081bbc>] lr : [<c0083c4c>] psr: 20000013
> >sp : c6837e58 ip : c6b4fa80 fp : bf00574c
> >r10: c0601008 r9 : bf005740 r8 : c0493e38
> >r7 : c00807f8 r6 : 00000000 r5 : 00000001 r4 : c6837f38
> >r3 : 00000fff r2 : bf000000 r1 : 00004b80 r0 : bf005818
> >Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> >Control: 0005317f Table: c6b78000 DAC: 00000051
> >Process systemd (pid: 1, stack limit = 0x(ptrval))
> >Stack: (0xc6837e58 to 0xc6838000)
> >7e40: bf005740 c6492a90
> >7e60: 00000001 c6bf5788 00000003 c6837f38 bf0058c0 bf0058a8 bf00a495 c0493e38
> >7e80: 00000000 c05368c0 00000001 00000000 c0580030 c0574b28 00000000 00000000
> >7ea0: 00000000 00000000 00000000 00000000 6e72656b 00006c65 00000000 00000000
> >7ec0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >7ee0: 00000000 00000000 00000000 aefb2bb8 7fffffff c0601008 00000000 00000004
> >7f00: b6cee714 c00091e4 c6836000 00000000 005bc668 c00847b0 7fffffff 00000000
> >7f20: 00000003 00000000 0000b2d0 c884e000 0000b2d0 00000000 c8852dcb c88533a0
> >7f40: c884e000 0000b2d0 c8858cb8 c8858b34 c88568d4 000058c0 00006190 000023b8
> >7f60: 00006713 00000000 00000000 00000000 000023a8 00000024 00000025 00000019
> >7f80: 0000001d 00000011 00000000 aefb2bb8 00000000 00000000 00000000 00000000
> >7fa0: 0000017b c0009000 00000000 00000000 00000004 b6cee714 00000000 00000000
> >7fc0: 00000000 00000000 00000000 0000017b 00000000 00000000 00000001 005bc668
> >7fe0: be907b00 be907af0 b6ce66b0 b6c3fac0 60000010 00000004 00000000 00000000
> >[<c0081bbc>] (frob_text.constprop.16) from [<c0083c4c>]
> >(load_module+0x1888/0x21b4)
> >[<c0083c4c>] (load_module) from [<c00847b0>] (sys_finit_module+0xbc/0xdc)
> >[<c00847b0>] (sys_finit_module) from [<c0009000>] (ret_fast_syscall+0x0/0x50)
> >Exception stack(0xc6837fa8 to 0xc6837ff0)
> >7fa0: 00000000 00000000 00000004 b6cee714 00000000 00000000
> >7fc0: 00000000 00000000 00000000 0000017b 00000000 00000000 00000001 005bc668
> >7fe0: be907b00 be907af0 b6ce66b0 b6c3fac0
> >Code: e1a01621 e1a00002 eafe4531 e7f001f2 (e7f001f2)
> >---[ end trace 2cbefb0005882c52 ]---
> >Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> >---[ end Kernel panic - not syncing: Attempted to kill init!
> >exitcode=0x0000000b ]---
> >
> >I bisected it to commit 06bd260e836d ("modules: fix BUG when load
> >module with rodata=n") with commit 7dabaa5ce05a ("modules: fix compile
> >error if don't have strict module rwx") on top to make it build.
> >
> >Let me know if you need me to provide more info.
> >
> >Best regards,
> >Bartosz Golaszewski
>
> Hi Bartosz,
>
> Thanks for reporting this, I was able to reproduce this on qemu.
>
> This is due to hitting the BUG_ON() in frob_text() due to layout->base
> and/or layout->text_size not being page-aligned. These values are
> always page-aligned when CONFIG_STRICT_MODULE_RWX=y, but in commit
> 2eef1399a86 ("modules: fix BUG when load module with rodata=n"), the
> frob_text()+set_memory_x() calls got moved *outside* of the
> STRICT_MODULE_RWX block since some arches (like x86 and arm64)
> allocate non-executable module memory via module_alloc(), so naturally
> the module text needed to be made executable at a later stage of
> load_module(), regardless of whether STRICT_MODULE_RWX is set or not.
> In your case, you must've had CONFIG_STRICT_MODULE_RWX=n and so we
> were calling frob_text() with non-page-aligned values, triggering the
> BUG_ON().
>
> In any case, could you please try and see if the following patch fixes
> the issue for you?
>
> diff --git a/kernel/module.c b/kernel/module.c
> index 5933395af9a0..cd8df516666d 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -64,14 +64,9 @@
>
> /*
> * Modules' sections will be aligned on page boundaries
> - * to ensure complete separation of code and data, but
> - * only when CONFIG_STRICT_MODULE_RWX=y
> + * to ensure complete separation of code and data
> */
> -#ifdef CONFIG_STRICT_MODULE_RWX
> # define debug_align(X) ALIGN(X, PAGE_SIZE)
> -#else
> -# define debug_align(X) (X)
> -#endif
>
> /* If this is set, the section belongs in the init part of the module */
> #define INIT_OFFSET_MASK (1UL << (BITS_PER_LONG-1))

That helps, thanks!

Please Cc me on the actual submitted patch.

Bart