[powerpc] Kernel crash while running xfstests (generic/250) [next-20220404]

From: Sachin Sant
Date: Mon Apr 04 2022 - 07:34:55 EST


While running xfstests (with either ext4 or XFS as the filesystem) on a Power10
LPAR booted with today’s next (5.18.0-rc1-next-20220404), the following crash is seen.

[ 51.260209] XFS (dm-0): Unmounting Filesystem
[ 51.262949] XFS (dm-0): Mounting V5 Filesystem
[ 51.270524] XFS (dm-0): Ending clean mount
[ 51.272641] xfs filesystem being mounted at /mnt/scratch supports timestamps until 2038 (0x7fffffff)
[ 51.377505] XFS (dm-0): Unmounting Filesystem
[ 51.397584] BUG: Unable to handle kernel data access at 0x5deadbeef0000122
[ 51.397591] Faulting instruction address: 0xc0000000001561bc
[ 51.397595] Oops: Kernel access of bad area, sig: 11 [#1]
[ 51.397598] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 51.397602] Modules linked in: xfs dm_mod ip_set rfkill nf_tables bonding libcrc32c nfnetlink sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio sch_fq_codel ext4 mbcache jbd2 sd_mod t10_pi crc64_rocksoft crc64 sg ibmvscsi ibmveth scsi_transport_srp fuse
[ 51.397626] CPU: 3 PID: 3448 Comm: dmsetup Not tainted 5.18.0-rc1-next-20220404 #16
[ 51.397630] NIP: c0000000001561bc LR: c0000000001560e8 CTR: c000000000672ef0
[ 51.397633] REGS: c000000095c9b610 TRAP: 0380 Not tainted (5.18.0-rc1-next-20220404)
[ 51.397636] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24024824 XER: 00000000
[ 51.397646] CFAR: c0000000001560f0 IRQMASK: 0
[ 51.397646] GPR00: c0000000001560e8 c000000095c9b8b0 c000000002a03800 0000000000000000
[ 51.397646] GPR04: c000000017a1ab78 0000000000000000 c00000002cab6ac0 c000000093e73900
[ 51.397646] GPR08: c000000093e73900 5deadbeef0000100 5deadbeef0000122 c008000001b5a4e8
[ 51.397646] GPR12: c000000000672ef0 c000000abfff8e80 000000013dbd0b60 00007fff849e9da8
[ 51.397646] GPR16: 00007fff849e9da8 00007fff849e9da8 00007fff84a23670 0000000000000000
[ 51.397646] GPR20: 00007fff849f3388 00007fff84a22040 000000013dbd0b90 0000000000000131
[ 51.397646] GPR24: c00000000254d768 ffffffffffff0000 c00000000254d730 c000000027668e00
[ 51.397646] GPR28: c0000000029b0170 c000000017a1ab78 0000000000000017 0000000000000000
[ 51.397684] NIP [c0000000001561bc] __cpuhp_state_remove_instance+0x19c/0x2c0
[ 51.397692] LR [c0000000001560e8] __cpuhp_state_remove_instance+0xc8/0x2c0
[ 51.397697] Call Trace:
[ 51.397698] [c000000095c9b8b0] [c0000000001560e8] __cpuhp_state_remove_instance+0xc8/0x2c0 (unreliable)
[ 51.397705] [c000000095c9b920] [c000000000672f4c] bioset_exit+0x5c/0x280
[ 51.397709] [c000000095c9b9c0] [c008000001b433f4] cleanup_mapped_device+0x4c/0x1a0 [dm_mod]
[ 51.397721] [c000000095c9ba00] [c008000001b436f0] __dm_destroy+0x1a8/0x360 [dm_mod]
[ 51.397730] [c000000095c9baa0] [c008000001b50e90] dev_remove+0x1a8/0x280 [dm_mod]
[ 51.397740] [c000000095c9bb30] [c008000001b5115c] ctl_ioctl+0x1f4/0x7c0 [dm_mod]
[ 51.397750] [c000000095c9bd40] [c008000001b51748] dm_ctl_ioctl+0x20/0x40 [dm_mod]
[ 51.397759] [c000000095c9bd60] [c0000000004b1f68] sys_ioctl+0xf8/0x150
[ 51.397763] [c000000095c9bdb0] [c00000000003373c] system_call_exception+0x18c/0x390
[ 51.397767] [c000000095c9be10] [c00000000000c64c] system_call_common+0xec/0x270
[ 51.397772] --- interrupt: c00 at 0x7fff84329210
[ 51.397776] NIP: 00007fff84329210 LR: 00007fff849e6824 CTR: 0000000000000000
[ 51.397780] REGS: c000000095c9be80 TRAP: 0c00 Not tainted (5.18.0-rc1-next-20220404)
[ 51.397785] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 24004484 XER: 00000000
[ 51.397795] IRQMASK: 0
[ 51.397795] GPR00: 0000000000000036 00007ffffdb43030 00007fff84407300 0000000000000003
[ 51.397795] GPR04: 00000000c138fd04 000000013dbd0b60 0000000000000004 00007fff849f3f98
[ 51.397795] GPR08: 0000000000000003 0000000000000000 0000000000000000 0000000000000000
[ 51.397795] GPR12: 0000000000000000 00007fff84acfa80 000000013dbd0b60 00007fff849e9da8
[ 51.397795] GPR16: 00007fff849e9da8 00007fff849e9da8 00007fff84a23670 0000000000000000
[ 51.397795] GPR20: 00007fff849f3388 00007fff84a22040 000000013dbd0b90 000000013dbd02e0
[ 51.397795] GPR24: 00007fff849e9da8 00007fff849e9da8 00007fff849e9da8 00007fff849e9da8
[ 51.397795] GPR28: 0000000000000001 00007fff849e9da8 0000000000000000 00007fff849e9da8
[ 51.397829] NIP [00007fff84329210] 0x7fff84329210
[ 51.397831] LR [00007fff849e6824] 0x7fff849e6824
[ 51.397834] --- interrupt: c00
[ 51.397835] Instruction dump:
[ 51.397838] 60000000 7f69db78 7f83e040 7c7f07b4 7bea1f24 419cffb4 eae10028 eb210038
[ 51.397844] eb610048 e93d0000 e95d0008 2fa90000 <f92a0000> 419e0008 f9490008 3d405dea
[ 51.397850] ---[ end trace 0000000000000000 ]---
[ 51.400133]
[ 52.400136] Kernel panic - not syncing: Fatal exception
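
As a reading aid for the oops above, here is a small sketch that checks the
fault address and GPR09/GPR10 against the kernel's list poison values and
decodes the faulting instruction word. The constants are assumptions: the
0x100/0x122 offsets as defined in include/linux/poison.h and the ppc64 default
of CONFIG_ILLEGAL_POINTER_VALUE. The helper names classify() and
decode_ds_store() are hypothetical and exist only for this illustration.

```python
# Assumed constants: LIST_POISON1/2 offsets as in include/linux/poison.h,
# plus the ppc64 default for CONFIG_ILLEGAL_POINTER_VALUE.
POISON_POINTER_DELTA = 0x5DEADBEEF0000000
LIST_POISON1 = POISON_POINTER_DELTA + 0x100
LIST_POISON2 = POISON_POINTER_DELTA + 0x122


def classify(addr):
    """Hypothetical helper: name the list poison value an address matches."""
    if addr == LIST_POISON1:
        return "LIST_POISON1"
    if addr == LIST_POISON2:
        return "LIST_POISON2"
    return None


def decode_ds_store(insn):
    """Decode a PowerPC DS-form 'std' word into (rs, ra, displacement).

    Returns None for anything that is not primary opcode 62 with XO 0.
    """
    opcode = insn >> 26
    xo = insn & 0x3
    if opcode != 62 or xo != 0:
        return None
    rs = (insn >> 21) & 0x1F
    ra = (insn >> 16) & 0x1F
    disp = insn & 0xFFFC
    if disp & 0x8000:  # sign-extend the 16-bit displacement
        disp -= 0x10000
    return rs, ra, disp


# Values copied from the oops above.
fault_addr = 0x5DEADBEEF0000122
gprs = {9: 0x5DEADBEEF0000100, 10: 0x5DEADBEEF0000122}
faulting_insn = 0xF92A0000  # the <...> word in the instruction dump

rs, ra, disp = decode_ds_store(faulting_insn)
print(f"faulting insn: std r{rs},{disp}(r{ra})")
print("fault address:", classify(fault_addr))
print(f"r{rs}:", classify(gprs[rs]))
print(f"r{ra}:", classify(gprs[ra]))
```

If these assumptions hold, the faulting store writes a poisoned next pointer
through a poisoned pprev pointer, which would be consistent with an hlist node
being deleted a second time in __cpuhp_state_remove_instance(). I have not
confirmed this against the source.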

This problem was possibly introduced with 5.17.0-next-20220330.
Git bisect points to the following commit:
commit 1d158814db8e7b3cbca0f2c8d9242fbec4fbc57e
dm: conditionally enable BIOSET_PERCPU_CACHE for dm_io bioset

-Sachin