dm-snap: there are issues for the rt kernel
From: Jiping Ma
Date: Fri Nov 28 2025 - 02:25:46 EST
We observed instability on the RT kernel during the upgrade and rollback test on 6.6, 6.12, and mainline.
The issue is related to dm_exception_table_lock(&lock), in which preempt_disable() is called twice.
If the code between dm_exception_table_lock(&lock) and dm_exception_table_unlock(&lock) calls rt_spin_lock(), it triggers a warning such as "BUG: scheduling while atomic: kworker/u72:11/349/0x00000003", because the preempt count is 3 at that point.
Several places in dm-snap.c have the same issue, such as dm_add_exception(), pending_complete() and snapshot_map().
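To illustrate the pattern, here is a minimal user-space sketch in C. It is only a model under stated assumptions, not the dm-snap code: every model_* name, the two-slot lock struct, and the counters are hypothetical stand-ins. It assumes each of the two lock acquisitions disables preemption once, and it treats every sleeping-lock acquisition as contended (in the real kernel only the contended slow path schedules).

```c
#include <stdio.h>

/* User-space model only: simplified stand-ins, not kernel code. */
static int preempt_count_model; /* stand-in for the task's preempt count */
static int sched_bug_count;     /* times the model "scheduled while atomic" */

static void model_preempt_disable(void) { preempt_count_model++; }
static void model_preempt_enable(void)  { preempt_count_model--; }

/* Models a bit-spinlock style lock: taking it disables preemption. */
static void model_bit_lock(int *slot)   { model_preempt_disable(); *slot = 1; }
static void model_bit_unlock(int *slot) { *slot = 0; model_preempt_enable(); }

/* Hypothetical two-slot lock, mirroring the "called twice" description. */
struct model_table_lock {
    int complete_slot;
    int pending_slot;
};

static void model_table_lock(struct model_table_lock *l)
{
    model_bit_lock(&l->complete_slot);
    model_bit_lock(&l->pending_slot);
}

static void model_table_unlock(struct model_table_lock *l)
{
    model_bit_unlock(&l->pending_slot);
    model_bit_unlock(&l->complete_slot);
}

/*
 * Models a sleeping lock on an RT kernel. The model assumes the lock is
 * always contended, so it always wants to schedule; that is a bug whenever
 * the preempt count is nonzero.
 */
static int model_rt_spin_lock(void)
{
    if (preempt_count_model != 0) {
        sched_bug_count++;
        printf("BUG (model): scheduling while atomic, count=%d\n",
               preempt_count_model);
        return -1;
    }
    return 0;
}

/* Takes the sleeping lock inside the table-lock section, as in the traces. */
static int run_model(void)
{
    struct model_table_lock lock = {0, 0};
    int rc;

    model_table_lock(&lock);
    rc = model_rt_spin_lock(); /* e.g. an allocator or list lock on RT */
    model_table_unlock(&lock);
    return rc;
}
```

Calling run_model() reports the bug because the sleeping lock is requested while the model's preempt count is still raised by the two slot locks. Moving the sleeping-lock call outside the locked section makes the model return 0, which is one possible direction for a fix, though it may not suit every call site.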
Do we need to reimplement dm_exception_table_lock()?
Any suggestions or assistance would be appreciated.
[ 862.410151] Kernel panic - not syncing: scheduling while atomic: panic_on_warn set ...
[ 862.580196] CPU: 2 UID: 0 PID: 349 Comm: kworker/u72:11 Kdump: loaded Tainted: G O 6.12.0-1-rt-amd64 #1 Debian 6.12.40-1.stx.130
[ 862.593223] Tainted: [O]=OOT_MODULE
[ 862.596714] Hardware name: Dell Inc. PowerEdge R740xd/00WGD1, BIOS 2.24.0 03/27/2025
[ 862.604453] Workqueue: writeback wb_workfn (flush-253:21)
[ 862.609852] Call Trace:
[ 862.612306] <TASK>
[ 862.614411] panic+0x34a/0x370
[ 862.617470] check_panic_on_warn+0x50/0x50
[ 862.621569] __schedule_bug+0x4d/0x60
[ 862.625236] __schedule+0xa0c/0xbb0
[ 862.628729] schedule_rtlock+0x1a/0x30
[ 862.632481] rtlock_slowlock_locked+0x20b/0xcc0
[ 862.637014] rt_spin_lock+0x40/0x60
[ 862.640506] __insert_pending_exception+0x4e/0xe0 [dm_snapshot]
[ 862.646424] __origin_write+0x2fb/0x360 [dm_snapshot]
[ 862.651477] do_origin+0xd5/0xe0 [dm_snapshot]
[ 862.655923] __map_bio+0x17c/0x1b0 [dm_mod]
[ 862.660117] dm_submit_bio+0x1ad/0x5a0 [dm_mod]
[ 862.664649] __submit_bio+0x144/0x240
[ 862.668315] ? __submit_bio+0xc1/0x240
[ 862.672067] submit_bio_noacct_nocheck+0x19a/0x3c0
[ 862.676860] iomap_submit_ioend+0x42/0x80
[ 862.680873] iomap_writepages+0x5f8/0x8d0
[ 862.684886] xfs_vm_writepages+0x62/0x90 [xfs]
[ 862.689473] do_writepages+0xcc/0x240
[ 862.693136] __writeback_single_inode+0x41/0x330
[ 862.697756] writeback_sb_inodes+0x21c/0x4d0
[ 862.702028] wb_writeback+0x7c/0x2f0
[ 862.705607] wb_workfn+0xc1/0x450
[ 862.708926] process_one_work+0x179/0x390
[ 862.712940] worker_thread+0x237/0x340
[ 862.716691] ? __pfx_worker_thread+0x10/0x10
[ 862.720964] kthread+0xc6/0x100
[ 862.724111] ? __pfx_kthread+0x10/0x10
[ 862.727863] ret_from_fork+0x2d/0x50
[ 862.731441] ? __pfx_kthread+0x10/0x10
[ 862.735193] ret_from_fork_asm+0x1a/0x30
[ 862.739122] </TASK>
and
[ 36.563812] BUG: scheduling while atomic: lvm/1380/0x00000003
......
[ 36.563841] CPU: 32 PID: 1380 Comm: lvm Tainted: G O 6.6.0-1-rt-amd64 #1 Debian 6.6.71-1.stx.104
[ 36.563844] Hardware name: ZTSYSTEMS Galene EI/Galene, BIOS 1.01 12/07/2023
[ 36.563845] Call Trace:
[ 36.563848] <TASK>
[ 36.563849] dump_stack_lvl+0x37/0x50
[ 36.563855] __schedule_bug+0x52/0x60
[ 36.563859] __schedule+0x87d/0xb10
[ 36.563861] ? update_load_avg+0x7e/0x750
[ 36.563865] schedule_rtlock+0x1f/0x40
[ 36.563866] rtlock_slowlock_locked+0x232/0xd40
[ 36.563870] ? __set_cpus_allowed_ptr+0x55/0xa0
[ 36.563873] ? dm_add_exception+0xb4/0xf0 [dm_snapshot]
[ 36.563879] rt_spin_lock+0x45/0x60
[ 36.563881] kmem_cache_free+0x182/0x480
[ 36.563884] dm_add_exception+0xb4/0xf0 [dm_snapshot]
[ 36.563889] persistent_read_metadata+0x29d/0x550 [dm_snapshot]
[ 36.563895] ? __pfx_dm_add_exception+0x10/0x10 [dm_snapshot]
[ 36.563900] snapshot_ctr+0x60b/0x8f0 [dm_snapshot]
[ 36.563905] dm_table_add_target+0x246/0x3b0 [dm_mod]
[ 36.563919] table_load+0x136/0x4b0 [dm_mod]
[ 36.563930] ? __pfx_table_load+0x10/0x10 [dm_mod]
[ 36.563940] ctl_ioctl+0x1b3/0x500 [dm_mod]
[ 36.563950] dm_ctl_ioctl+0xe/0x20 [dm_mod]
[ 36.563960] __x64_sys_ioctl+0x8f/0xd0
[ 36.563964] do_syscall_64+0x58/0xb0
[ 36.563967] ? dm_ctl_ioctl+0xe/0x20 [dm_mod]
[ 36.563976] ? __ct_user_enter+0x2f/0xd0
[ 36.563978] ? syscall_exit_to_user_mode+0x32/0x40
[ 36.563980] ? do_syscall_64+0x65/0xb0
[ 36.563983] ? exit_to_user_mode_prepare+0xa9/0x190
[ 36.563985] ? __ct_user_enter+0x2f/0xd0
[ 36.563987] ? syscall_exit_to_user_mode+0x32/0x40
Thanks,
Jiping