Re: POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

From: Qian Cai
Date: Thu Apr 16 2020 - 22:40:51 EST

> On Apr 16, 2020, at 10:27 PM, Russell Currey <ruscur@xxxxxxxxxx> wrote:
>
> Reverting the patch with the given config will have the same effect as
> STRICT_KERNEL_RWX=n. Not discounting that it could be a bug on the
> powerpc side (i.e. relocatable kernels with strict RWX on haven't been
> exhaustively tested yet), but we should definitely figure out what's
> going on with this bad access first.

BTW, this bad access only happened once. The overwhelming majority of the crashes are NULL pointer NIPs like the one below. If STRICT_KERNEL_RWX is just the messenger, how do you explain that STRICT_KERNEL_RWX=n also makes those NULL NIPs disappear?

[ 215.281666][T16896] LTP: starting chown04_16
[ 215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[ 215.424289][T18297] Faulting instruction address: 0x00000000
[ 215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
[ 215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[ 215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
[ 215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G W 5.6.0-next-20200405+ #3
[ 215.424489][T18297] NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
[ 215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400 Tainted: G W (5.6.0-next-20200405+)
[ 215.424570][T18297] MSR: 9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84000248 XER: 20040000
[ 215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
[ 215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
[ 215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
[ 215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
[ 215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
[ 215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
[ 215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
[ 215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
[ 215.424914][T18297] NIP [0000000000000000] 0x0
[ 215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
find_free_cb at drivers/block/loop.c:2129
[ 215.424997][T18297] Call Trace:
[ 215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
[ 215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
loop_lookup at drivers/block/loop.c:2144
[ 215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
[ 215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
[ 215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
[ 215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
[ 215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
[ 215.425314][T18297] Instruction dump:
[ 215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[ 215.425422][T18297] ---[ end trace ebed248fad431966 ]---
[ 215.642114][T18297]
[ 216.642220][T18297] Kernel panic - not syncing: Fatal exception
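
To put a number on "only happened once" versus "the overwhelming majority", the captured console logs could be tallied by faulting NIP. Below is a minimal sketch in Python, not something taken from this thread: the function name classify_oops and the regular expression are assumptions for illustration, and it relies only on the "NIP: ... LR: ... CTR: ..." register line format visible in the trace above.

```python
import re

# Matches the register summary line of a powerpc oops, e.g.
# "NIP: 0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000"
REG_LINE = re.compile(
    r"NIP:\s+([0-9a-f]{16})\s+LR:\s+([0-9a-f]{16})\s+CTR:\s+([0-9a-f]{16})"
)


def classify_oops(log_text):
    """Return one (kind, nip, lr) tuple per register summary line found.

    kind is "null-nip" when the faulting instruction address is zero,
    and "other" for every other crash (hypothetical labels).
    """
    results = []
    for match in REG_LINE.finditer(log_text):
        nip = int(match.group(1), 16)
        lr = int(match.group(2), 16)
        kind = "null-nip" if nip == 0 else "other"
        results.append((kind, nip, lr))
    return results


if __name__ == "__main__":
    sample = (
        "[ 215.424489][T18297] NIP: 0000000000000000 "
        "LR: c00800000fbc0408 CTR: 0000000000000000"
    )
    for kind, nip, lr in classify_oops(sample):
        print(f"{kind} NIP={nip:#x} LR={lr:#x}")
```

Running this over the concatenated logs from several crashes and counting the "null-nip" entries would show how the two failure modes are distributed; the LR values can then be grouped to see whether the NULL NIP always comes from the same caller.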