Re: PROBLEM: kernel crashes when running xfsdump since ~6.4

From: Hailong Liu
Date: Thu Jun 20 2024 - 02:37:59 EST


On Thu, 20. Jun 02:19, Nick Bowler wrote:
> Hi,
>
> After upgrading my sparc to 6.9.5 I noticed that attempting to run
> xfsdump instantly (within a couple seconds) and reliably crashes the
> kernel. The same problem is also observed on 6.10-rc4.
>
> This is a regression introduced around the 6.4 timeframe. 6.3 appears
> to work fine and xfsdump goes about its business dumping stuff.
>
> Bisection implicates the following:
>
> 062eacf57ad91b5c272f89dc964fd6dd9715ea7d is the first bad commit
> commit 062eacf57ad91b5c272f89dc964fd6dd9715ea7d
> Author: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
> Date: Thu Mar 30 21:06:38 2023 +0200
>
> mm: vmalloc: remove a global vmap_blocks xarray
>
> This reverts pretty easily on top of v6.10-rc4, as long as I first
> revert fa1c77c13ca5 ("mm: vmalloc: rename addr_to_vb_xarray() function")
> as this just causes conflicts. Then there is one easily-corrected build
> failure (adjust the one remaining &vbq->vmap_blocks back to &vmap_blocks).
>
> If I do all of that then the kernel is not crashing anymore.
>
> A splat like this one is output on the console when the crash occurs (varies a bit):
>
> spitfire_data_access_exception: SFSR[000000000080100d] SFAR[0000000000c51ba0], going.
> \|/ ____ \|/
> "@'/ .. \`@"
> /_| \__/ |_\
> \__U_/
> xfsdump(2028): Dax [#1]
> CPU: 0 PID: 2028 Comm: xfsdump Not tainted 6.9.5 #199
> TSTATE: 0000000811001607 TPC: 0000000000974fc4 TNPC: 0000000000974fc8 Y: 00000000 Not tainted
> TPC: <queued_spin_lock_slowpath+0x1d0/0x2cc>
> g0: 0000000000aa9110 g1: 0000000000c51ba0 g2: 444b000000000000 g3: 0000000000c560c0
> g4: fffff800a71a1f00 g5: fffff800bebb6000 g6: fffff800ac0ec000 g7: 0000000000040000
> o0: 0000000000000002 o1: 00000000000007d8 o2: fffff800a4131420 o3: ffffffff0000ffff
> o4: 00000000900a2001 o5: 0000000000c4f5a0 sp: fffff800ac0eeac1 ret_pc: 0000000000040000
> RPC: <0x40000>
> l0: fffff800a40098c0 l1: 0000000100800000 l2: 0000000000000000 l3: 0000000000000103
> l4: fffff800a40081b0 l5: 0000000000aeec00 l6: fffff800a40080a0 l7: 0000000101000000
> i0: 0000000000c4f5a0 i1: 00000000900a2001 i2: 0000000000000000 i3: fffff800bf807b80
> i4: 0000000000000000 i5: fffff800bf807b80 i6: fffff800ac0eeb71 i7: 0000000000503438
> I7: <vm_map_ram+0x210/0x724>
> Call Trace:
> [<0000000000503438>] vm_map_ram+0x210/0x724
> [<00000000006661f8>] _xfs_buf_map_pages+0x58/0xa0
> [<0000000000667058>] xfs_buf_get_map+0x668/0x7a4
> [<00000000006673e0>] xfs_buf_read_map+0x20/0x160
> [<0000000000667548>] xfs_buf_readahead_map+0x28/0x38
> [<000000000067a4f8>] xfs_iwalk_ichunk_ra.isra.0+0xa8/0xc4
> [<000000000067a8f0>] xfs_iwalk_ag+0x1c0/0x260
> [<000000000067ab08>] xfs_iwalk+0xdc/0x130
> [<0000000000679fc8>] xfs_bulkstat+0x10c/0x140
> [<0000000000695528>] xfs_compat_ioc_fsbulkstat+0x1a4/0x1e8
> [<000000000069572c>] xfs_file_compat_ioctl+0x8c/0x1f4
> [<0000000000534ab0>] compat_sys_ioctl+0x9c/0xfc
> [<0000000000406214>] linux_sparc_syscall32+0x34/0x60
> Disabling lock debugging due to kernel taint
> Caller[0000000000503438]: vm_map_ram+0x210/0x724
> Caller[00000000006661f8]: _xfs_buf_map_pages+0x58/0xa0
> Caller[0000000000667058]: xfs_buf_get_map+0x668/0x7a4
> Caller[00000000006673e0]: xfs_buf_read_map+0x20/0x160
> Caller[0000000000667548]: xfs_buf_readahead_map+0x28/0x38
> Caller[000000000067a4f8]: xfs_iwalk_ichunk_ra.isra.0+0xa8/0xc4
> Caller[000000000067a8f0]: xfs_iwalk_ag+0x1c0/0x260
> Caller[000000000067ab08]: xfs_iwalk+0xdc/0x130
> Caller[0000000000679fc8]: xfs_bulkstat+0x10c/0x140
> Caller[0000000000695528]: xfs_compat_ioc_fsbulkstat+0x1a4/0x1e8
> Caller[000000000069572c]: xfs_file_compat_ioctl+0x8c/0x1f4
> Caller[0000000000534ab0]: compat_sys_ioctl+0x9c/0xfc
> Caller[0000000000406214]: linux_sparc_syscall32+0x34/0x60
> Caller[00000000f789ccdc]: 0xf789ccdc
> Instruction DUMP:
> 8610e0c0
> 8400c002
> c458a0f8
> <f6704002>
> c206e008
> 80a06000
> 12400012
> 01000000
> 81408000
>
> Let me know if you need any more info!
>
> Thanks,
> Nick
>
I guess you can try applying this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-hotfixes-unstable&id=00468d41c20cac748c2e4bfcf003283d554673f5
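
A rough sketch of one way to pull that commit into a local tree for testing.
The repository URL, branch name and commit id are taken from the link above.
The apply_fix helper name is made up for illustration, and it assumes you run
it from inside a kernel git checkout (e.g. v6.10-rc4). As far as I know
mm-hotfixes-unstable gets rebased, so the commit id may not stay valid.

```shell
# Values copied from the cgit link above.
MM_TREE="https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git"
MM_BRANCH="mm-hotfixes-unstable"
FIX_COMMIT="00468d41c20cac748c2e4bfcf003283d554673f5"

# Hypothetical helper: fetch the branch that carries the fix, then
# cherry-pick the single commit onto the currently checked-out tree.
# Must be run from inside a kernel git checkout.
apply_fix() {
    git fetch "$MM_TREE" "$MM_BRANCH" &&
    git cherry-pick "$FIX_COMMIT"
}
```

Run apply_fix in the tree, rebuild, and retry xfsdump. If the cherry-pick
conflicts, "git cherry-pick --abort" puts the tree back as it was.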

--
help you, help me,
Hailong.