6.13/regression/bisected - after commit f1be1788a32e I see in the kernel log "possible circular locking dependency detected"
From: Mikhail Gavrilov
Date: Tue Dec 17 2024 - 20:52:10 EST
Hi,
After commit f1be1788a32e I see "possible circular locking dependency
detected" in the kernel log, with the following stack trace:
[ 740.877178] ======================================================
[ 740.877180] WARNING: possible circular locking dependency detected
[ 740.877182] 6.13.0-rc3-f44d154d6e3d+ #392 Tainted: G W L
[ 740.877184] ------------------------------------------------------
[ 740.877186] btrfs-transacti/839 is trying to acquire lock:
[ 740.877188] ffff888182336a50
(&q->q_usage_counter(io)#2){++++}-{0:0}, at: __submit_bio+0x335/0x520
[ 740.877197]
but task is already holding lock:
[ 740.877198] ffff8881826f7048 (btrfs-tree-00){++++}-{4:4}, at:
btrfs_tree_read_lock_nested+0x27/0x170
[ 740.877205]
which lock already depends on the new lock.
[ 740.877206]
the existing dependency chain (in reverse order) is:
[ 740.877207]
-> #4 (btrfs-tree-00){++++}-{4:4}:
[ 740.877211] lock_release+0x397/0xd90
[ 740.877215] up_read+0x1b/0x30
[ 740.877217] btrfs_search_slot+0x16c9/0x31f0
[ 740.877220] btrfs_lookup_inode+0xa9/0x360
[ 740.877222] __btrfs_update_delayed_inode+0x131/0x760
[ 740.877225] btrfs_async_run_delayed_root+0x4bc/0x630
[ 740.877226] btrfs_work_helper+0x1b5/0xa50
[ 740.877228] process_one_work+0x899/0x14b0
[ 740.877231] worker_thread+0x5e6/0xfc0
[ 740.877233] kthread+0x2d2/0x3a0
[ 740.877235] ret_from_fork+0x31/0x70
[ 740.877238] ret_from_fork_asm+0x1a/0x30
[ 740.877240]
-> #3 (&delayed_node->mutex){+.+.}-{4:4}:
[ 740.877244] __mutex_lock+0x1ab/0x12c0
[ 740.877247] __btrfs_release_delayed_node.part.0+0xa0/0xd40
[ 740.877249] btrfs_evict_inode+0x44d/0xc20
[ 740.877252] evict+0x3a4/0x840
[ 740.877255] dispose_list+0xf0/0x1c0
[ 740.877257] prune_icache_sb+0xe3/0x160
[ 740.877259] super_cache_scan+0x30d/0x4f0
[ 740.877261] do_shrink_slab+0x349/0xd60
[ 740.877264] shrink_slab+0x7a4/0xd20
[ 740.877266] shrink_one+0x403/0x830
[ 740.877268] shrink_node+0x2337/0x3a60
[ 740.877270] balance_pgdat+0xa4f/0x1500
[ 740.877272] kswapd+0x4f3/0x940
[ 740.877274] kthread+0x2d2/0x3a0
[ 740.877276] ret_from_fork+0x31/0x70
[ 740.877278] ret_from_fork_asm+0x1a/0x30
[ 740.877280]
-> #2 (fs_reclaim){+.+.}-{0:0}:
[ 740.877283] fs_reclaim_acquire+0xc9/0x110
[ 740.877286] __kmalloc_noprof+0xeb/0x690
[ 740.877288] sd_revalidate_disk.isra.0+0x4356/0x8e00
[ 740.877291] sd_probe+0x869/0xfa0
[ 740.877293] really_probe+0x1e0/0x8a0
[ 740.877295] __driver_probe_device+0x18c/0x370
[ 740.877297] driver_probe_device+0x4a/0x120
[ 740.877299] __device_attach_driver+0x162/0x270
[ 740.877300] bus_for_each_drv+0x115/0x1a0
[ 740.877303] __device_attach_async_helper+0x1a0/0x240
[ 740.877305] async_run_entry_fn+0x97/0x4f0
[ 740.877307] process_one_work+0x899/0x14b0
[ 740.877309] worker_thread+0x5e6/0xfc0
[ 740.877310] kthread+0x2d2/0x3a0
[ 740.877312] ret_from_fork+0x31/0x70
[ 740.877314] ret_from_fork_asm+0x1a/0x30
[ 740.877316]
-> #1 (&q->limits_lock){+.+.}-{4:4}:
[ 740.877320] __mutex_lock+0x1ab/0x12c0
[ 740.877321] nvme_update_ns_info_block+0x476/0x2630 [nvme_core]
[ 740.877332] nvme_update_ns_info+0xbe/0xa60 [nvme_core]
[ 740.877339] nvme_alloc_ns+0x1589/0x2c40 [nvme_core]
[ 740.877346] nvme_scan_ns+0x579/0x660 [nvme_core]
[ 740.877353] async_run_entry_fn+0x97/0x4f0
[ 740.877355] process_one_work+0x899/0x14b0
[ 740.877357] worker_thread+0x5e6/0xfc0
[ 740.877358] kthread+0x2d2/0x3a0
[ 740.877360] ret_from_fork+0x31/0x70
[ 740.877362] ret_from_fork_asm+0x1a/0x30
[ 740.877364]
-> #0 (&q->q_usage_counter(io)#2){++++}-{0:0}:
[ 740.877368] __lock_acquire+0x3216/0x6680
[ 740.877370] lock_acquire+0x1ae/0x560
[ 740.877372] blk_mq_submit_bio+0x184e/0x1ea0
[ 740.877374] __submit_bio+0x335/0x520
[ 740.877376] submit_bio_noacct_nocheck+0x546/0xca0
[ 740.877377] btrfs_submit_chunk+0x4be/0x1770
[ 740.877380] btrfs_submit_bbio+0x37/0x80
[ 740.877381] read_extent_buffer_pages+0x488/0x750
[ 740.877384] btrfs_read_extent_buffer+0x13e/0x6a0
[ 740.877385] read_block_for_search+0x4f6/0x9a0
[ 740.877388] btrfs_search_slot+0x85b/0x31f0
[ 740.877390] btrfs_lookup_inode+0xa9/0x360
[ 740.877391] __btrfs_update_delayed_inode+0x131/0x760
[ 740.877393] __btrfs_run_delayed_items+0x397/0x590
[ 740.877395] btrfs_commit_transaction+0x462/0x2f00
[ 740.877397] transaction_kthread+0x2cc/0x400
[ 740.877399] kthread+0x2d2/0x3a0
[ 740.877401] ret_from_fork+0x31/0x70
[ 740.877403] ret_from_fork_asm+0x1a/0x30
[ 740.877405]
other info that might help us debug this:
[ 740.877406] Chain exists of:
&q->q_usage_counter(io)#2 --> &delayed_node->mutex
--> btrfs-tree-00
[ 740.877412] Possible unsafe locking scenario:
[ 740.877413] CPU0 CPU1
[ 740.877414] ---- ----
[ 740.877415] rlock(btrfs-tree-00);
[ 740.877417] lock(&delayed_node->mutex);
[ 740.877419] lock(btrfs-tree-00);
[ 740.877421] rlock(&q->q_usage_counter(io)#2);
[ 740.877424]
*** DEADLOCK ***
[ 740.877425] 5 locks held by btrfs-transacti/839:
[ 740.877427] #0: ffff8881e7a94860
(&fs_info->transaction_kthread_mutex){+.+.}-{4:4}, at:
transaction_kthread+0x108/0x400
[ 740.877433] #1: ffff8881e7a96790
(btrfs_trans_num_writers){++++}-{0:0}, at:
join_transaction+0x391/0xf30
[ 740.877438] #2: ffff8881e7a967c0
(btrfs_trans_num_extwriters){++++}-{0:0}, at:
join_transaction+0x391/0xf30
[ 740.877443] #3: ffff888165e1e240
(&delayed_node->mutex){+.+.}-{4:4}, at:
__btrfs_run_delayed_items+0x352/0x590
[ 740.877448] #4: ffff8881826f7048 (btrfs-tree-00){++++}-{4:4}, at:
btrfs_tree_read_lock_nested+0x27/0x170
[ 740.877453]
stack backtrace:
[ 740.877455] CPU: 2 UID: 0 PID: 839 Comm: btrfs-transacti Tainted: G
W L 6.13.0-rc3-f44d154d6e3d+ #392
[ 740.877458] Tainted: [W]=WARN, [L]=SOFTLOCKUP
[ 740.877460] Hardware name: ASUS System Product Name/ROG STRIX
B650E-I GAMING WIFI, BIOS 3040 09/12/2024
[ 740.877462] Call Trace:
[ 740.877464] <TASK>
[ 740.877466] dump_stack_lvl+0x84/0xd0
[ 740.877469] print_circular_bug.cold+0x1e0/0x274
[ 740.877472] check_noncircular+0x308/0x3f0
[ 740.877475] ? __pfx_check_noncircular+0x10/0x10
[ 740.877477] ? mark_lock+0xf5/0x1730
[ 740.877480] ? lockdep_lock+0xca/0x1c0
[ 740.877482] ? __pfx_lockdep_lock+0x10/0x10
[ 740.877485] __lock_acquire+0x3216/0x6680
[ 740.877488] ? __pfx___lock_acquire+0x10/0x10
[ 740.877491] ? mark_lock+0xf5/0x1730
[ 740.877493] ? mark_lock+0xf5/0x1730
[ 740.877495] lock_acquire+0x1ae/0x560
[ 740.877498] ? __submit_bio+0x335/0x520
[ 740.877500] ? __pfx_lock_acquire+0x10/0x10
[ 740.877502] ? __pfx_lock_release+0x10/0x10
[ 740.877506] blk_mq_submit_bio+0x184e/0x1ea0
[ 740.877508] ? __submit_bio+0x335/0x520
[ 740.877510] ? __pfx_blk_mq_submit_bio+0x10/0x10
[ 740.877512] ? __pfx___lock_acquire+0x10/0x10
[ 740.877515] ? __pfx_mark_lock+0x10/0x10
[ 740.877518] __submit_bio+0x335/0x520
[ 740.877520] ? __pfx___submit_bio+0x10/0x10
[ 740.877522] ? __pfx_lock_release+0x10/0x10
[ 740.877525] ? seqcount_lockdep_reader_access.constprop.0+0xa5/0xb0
[ 740.877528] ? lockdep_hardirqs_on+0x7c/0x100
[ 740.877531] ? submit_bio_noacct_nocheck+0x546/0xca0
[ 740.877533] submit_bio_noacct_nocheck+0x546/0xca0
[ 740.877535] ? __pfx___might_resched+0x10/0x10
[ 740.877538] ? __pfx_submit_bio_noacct_nocheck+0x10/0x10
[ 740.877541] ? submit_bio_noacct+0x134f/0x1ae0
[ 740.877543] btrfs_submit_chunk+0x4be/0x1770
[ 740.877546] ? __pfx_btrfs_submit_chunk+0x10/0x10
[ 740.877548] ? bio_init+0x398/0x620
[ 740.877551] ? bvec_alloc+0xd7/0x1b0
[ 740.877553] ? bio_alloc_bioset+0x3fb/0x700
[ 740.877556] ? __pfx_bio_add_folio+0x10/0x10
[ 740.877558] ? kasan_print_aux_stacks+0x20/0x30
[ 740.877561] ? __pfx_end_bbio_meta_read+0x10/0x10
[ 740.877564] btrfs_submit_bbio+0x37/0x80
[ 740.877566] read_extent_buffer_pages+0x488/0x750
[ 740.877568] ? __pfx_free_extent_buffer+0x10/0x10
[ 740.877571] btrfs_read_extent_buffer+0x13e/0x6a0
[ 740.877573] ? btrfs_release_path+0x55/0x2c0
[ 740.877576] ? rcu_is_watching+0x12/0xc0
[ 740.877579] read_block_for_search+0x4f6/0x9a0
[ 740.877582] ? __pfx_read_block_for_search+0x10/0x10
[ 740.877586] btrfs_search_slot+0x85b/0x31f0
[ 740.877590] ? lock_acquire+0x1ae/0x560
[ 740.877592] ? __pfx_btrfs_search_slot+0x10/0x10
[ 740.877594] ? find_held_lock+0x34/0x120
[ 740.877596] ? __btrfs_run_delayed_items+0x352/0x590
[ 740.877598] ? __pfx_lock_acquired+0x10/0x10
[ 740.877601] btrfs_lookup_inode+0xa9/0x360
[ 740.877603] ? __pfx_btrfs_lookup_inode+0x10/0x10
[ 740.877605] ? __mutex_lock+0x46f/0x12c0
[ 740.877607] ? __btrfs_run_delayed_items+0x320/0x590
[ 740.877610] __btrfs_update_delayed_inode+0x131/0x760
[ 740.877612] ? bit_wait_timeout+0xe0/0x170
[ 740.877615] ? __pfx___btrfs_update_delayed_inode+0x10/0x10
[ 740.877617] ? __btrfs_release_delayed_node.part.0+0x2d1/0xd40
[ 740.877620] __btrfs_run_delayed_items+0x397/0x590
[ 740.877624] btrfs_commit_transaction+0x462/0x2f00
[ 740.877628] ? __raw_spin_lock_init+0x3f/0x110
[ 740.877631] ? __pfx_btrfs_commit_transaction+0x10/0x10
[ 740.877633] ? start_transaction+0x542/0x15f0
[ 740.877637] transaction_kthread+0x2cc/0x400
[ 740.877639] ? __pfx_warn_bogus_irq_restore+0x10/0x10
[ 740.877643] ? __pfx_transaction_kthread+0x10/0x10
[ 740.877645] kthread+0x2d2/0x3a0
[ 740.877647] ? _raw_spin_unlock_irq+0x28/0x60
[ 740.877650] ? __pfx_kthread+0x10/0x10
[ 740.877652] ret_from_fork+0x31/0x70
[ 740.877654] ? __pfx_kthread+0x10/0x10
[ 740.877656] ret_from_fork_asm+0x1a/0x30
[ 740.877660] </TASK>
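If I read the dependency chain above correctly, the cycle lockdep found is
(one edge per numbered step, my annotations in parentheses):

    q->q_usage_counter(io)     (queue frozen for nvme_update_ns_info)
      -> q->limits_lock        (taken in nvme_update_ns_info_block)
      -> fs_reclaim            (GFP_KERNEL allocation in sd_revalidate_disk
                                under limits_lock)
      -> delayed_node->mutex   (btrfs inode eviction from the kswapd shrinker)
      -> btrfs-tree-00         (btrfs_search_slot in __btrfs_update_delayed_inode)
      -> q->q_usage_counter(io) (this trace: bio submitted during transaction
                                 commit while the tree lock is held)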
This happens nearly every time I update a docker container.
Git bisect said that the first bad commit is:
f1be1788a32e8fa63416ad4518bbd1a85a825c9d
Author: Ming Lei <ming.lei@xxxxxxxxxx>
Date: Fri Oct 25 08:37:20 2024 +0800
block: model freeze & enter queue as lock for supporting lockdep
Recently we got several deadlock report[1][2][3] caused by
blk_mq_freeze_queue and blk_enter_queue().
Turns out the two are just like acquiring read/write lock, so model them
as read/write lock for supporting lockdep:
1) model q->q_usage_counter as two locks(io and queue lock)
- queue lock covers sync with blk_enter_queue()
- io lock covers sync with bio_enter_queue()
2) make the lockdep class/key as per-queue:
- different subsystem has very different lock use pattern, shared lock
class causes false positive easily
- freeze_queue degrades to no lock in case that disk state becomes DEAD
because bio_enter_queue() won't be blocked any more
- freeze_queue degrades to no lock in case that request queue becomes dying
because blk_enter_queue() won't be blocked any more
3) model blk_mq_freeze_queue() as acquire_exclusive & try_lock
- it is exclusive lock, so dependency with blk_enter_queue() is covered
- it is trylock because blk_mq_freeze_queue() are allowed to run
concurrently
4) model blk_enter_queue() & bio_enter_queue() as acquire_read()
- nested blk_enter_queue() are allowed
- dependency with blk_mq_freeze_queue() is covered
- blk_queue_exit() is often called from other contexts(such as irq), and
it can't be annotated as lock_release(), so simply do it in
blk_enter_queue(), this way still covered cases as many as possible
With lockdep support, such kind of reports may be reported asap and
needn't wait until the real deadlock is triggered.
For example, lockdep report can be triggered in the report[3] with this
patch applied.
[1] occasional block layer hang when setting 'echo noop >
/sys/block/sda/queue/scheduler'
https://bugzilla.kernel.org/show_bug.cgi?id=219166
[2] del_gendisk() vs blk_queue_enter() race condition
https://lore.kernel.org/linux-block/20241003085610.GK11458@xxxxxxxxxx/
[3] queue_freeze & queue_enter deadlock in scsi
https://lore.kernel.org/linux-block/ZxG38G9BuFdBpBHZ@fedora/T/#u
Reviewed-by: Christoph Hellwig <hch@xxxxxx>
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
Link: https://lore.kernel.org/r/20241025003722.3630252-4-ming.lei@xxxxxxxxxx
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
block/blk-core.c | 18 ++++++++++++++++--
block/blk-mq.c | 26 ++++++++++++++++++++++----
block/blk.h | 29 ++++++++++++++++++++++++++---
block/genhd.c | 15 +++++++++++----
include/linux/blkdev.h | 6 ++++++
5 files changed, 81 insertions(+), 13 deletions(-)
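As I understand the modeling described in that commit message, the annotation
pattern boils down to something like the sketch below (my paraphrase using the
generic lockdep API, not the actual patch; only the io side is shown, and the
struct/function names are assumed for illustration):

    /*
     * Minimal sketch of the modeling above: give the queue's
     * q_usage_counter a per-queue lockdep map, annotate freeze as an
     * exclusive trylock, and annotate entering the queue for IO as a
     * read lock.
     */
    #include <linux/kernel.h>
    #include <linux/lockdep.h>

    struct sketch_queue {
    	struct lockdep_map io_lockdep_map;	/* models q->q_usage_counter(io) */
    	struct lock_class_key io_lock_cls_key;	/* per-queue class (point 2 above) */
    };

    static void sketch_alloc_queue(struct sketch_queue *q)
    {
    	lockdep_register_key(&q->io_lock_cls_key);
    	lockdep_init_map(&q->io_lockdep_map, "&q->q_usage_counter(io)",
    			 &q->io_lock_cls_key, 0);
    }

    static void sketch_freeze_queue_start(struct sketch_queue *q)
    {
    	/* point 3: exclusive + trylock, so concurrent freezes stay legal */
    	rwsem_acquire(&q->io_lockdep_map, 0, 1, _RET_IP_);
    }

    static void sketch_unfreeze_queue(struct sketch_queue *q)
    {
    	rwsem_release(&q->io_lockdep_map, _RET_IP_);
    }

    static void sketch_bio_enter_queue(struct sketch_queue *q)
    {
    	/*
    	 * point 4: shared acquire; since the matching exit can run from
    	 * irq context, the release is annotated immediately here, which
    	 * still records the dependency against the freeze side.
    	 */
    	rwsem_acquire_read(&q->io_lockdep_map, 0, 0, _RET_IP_);
    	rwsem_release(&q->io_lockdep_map, _RET_IP_);
    }

That would explain why the freeze in nvme_update_ns_info (which holds the
"write" side across limits_lock) now forms a lockdep edge against every bio
submission, including the btrfs one in the trace above.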
Unfortunately I can't test a revert of commit f1be1788a32e because it conflicts:
> git revert -n f1be1788a32e8fa63416ad4518bbd1a85a825c9d
Auto-merging block/blk-core.c
CONFLICT (content): Merge conflict in block/blk-core.c
Auto-merging block/blk-mq.c
CONFLICT (content): Merge conflict in block/blk-mq.c
Auto-merging block/blk.h
CONFLICT (content): Merge conflict in block/blk.h
Auto-merging block/genhd.c
CONFLICT (content): Merge conflict in block/genhd.c
Auto-merging include/linux/blkdev.h
error: could not revert f1be1788a32e... block: model freeze & enter
queue as lock for supporting lockdep
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: Disable this message with "git config advice.mergeConflict false"
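So I can only point at the bisect result itself; checking out the direct
parent of the bad commit (which bisect already marked good) is the closest
substitute for a revert:

> git checkout f1be1788a32e^
(rebuild and re-run the docker container update; no splat there)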
Machine spec: https://linux-hardware.org/?probe=1a9ffb9719
I attached the full kernel log and build config below.
Ming, as the author of the first bad commit, could you please look into this?
--
Best Regards,
Mike Gavrilov.
Attachment: .config.zip (Zip archive)
Attachment: dmesg-6.13.0-rc3-f44d154d6e3d.zip (Zip archive)