Re: WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe (bio_associate_blkg_from_css+0x3e5/0x650 pool_map+0x23/0x70)

From: Ming Lei
Date: Tue Aug 03 2021 - 03:13:04 EST


On Mon, Aug 02, 2021 at 05:11:08PM +0200, Bruno Goncalves wrote:
> Hello,
>
> We've hit the issue below twice during our tests on kernel 5.14.0-rc3.
> Unfortunately, we don't have a reliable reproducer.
>
> [ 496.176739] =====================================================
> [ 496.182844] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> [ 496.189471] 5.14.0-rc3 #1 Not tainted
> [ 496.193152] -----------------------------------------------------
> [ 496.199252] systemd-udevd/12979 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> [ 496.206399] ffff995b358dc5a0 (&q->queue_lock){....}-{2:2}, at:
> bio_associate_blkg_from_css+0x3e5/0x650
> [ 496.215726]
> and this task is already holding:
> [ 496.221563] ffff995b081d0ad8 (&pool->lock#3){..-.}-{2:2}, at:
> pool_map+0x23/0x70 [dm_thin_pool]
> [ 496.230282] which would create a new lock dependency:
> [ 496.235344] (&pool->lock#3){..-.}-{2:2} -> (&q->queue_lock){....}-{2:2}
> [ 496.242059]
> but this new dependency connects a SOFTIRQ-irq-safe lock:
> [ 496.249988] (&pool->lock#3){..-.}-{2:2}
> [ 496.249993]
> ... which became SOFTIRQ-irq-safe at:
> [ 496.260111] lock_acquire+0xb5/0x2b0
> [ 496.263785] _raw_spin_lock_irqsave+0x48/0x60
> [ 496.268240] overwrite_endio+0x46/0x70 [dm_thin_pool]
> [ 496.273393] clone_endio+0xb9/0x1e0
> [ 496.276979] clone_endio+0xb9/0x1e0
> [ 496.280565] blk_update_request+0x25b/0x420
> [ 496.284846] blk_mq_end_request+0x1c/0x130
> [ 496.289041] blk_complete_reqs+0x37/0x40
> [ 496.293071] __do_softirq+0xde/0x485
> [ 496.296744] run_ksoftirqd+0x3a/0x70
> [ 496.300427] smpboot_thread_fn+0xf2/0x1c0
> [ 496.304534] kthread+0x143/0x160
> [ 496.307863] ret_from_fork+0x22/0x30
> [ 496.311538]
> to a SOFTIRQ-irq-unsafe lock:
> [ 496.317037] (&blkcg->lock){+.+.}-{2:2}
> [ 496.317041]
> ... which became SOFTIRQ-irq-unsafe at:
> [ 496.327243] ...
> [ 496.327244] lock_acquire+0xb5/0x2b0
> [ 496.332680] _raw_spin_lock+0x2c/0x40
> [ 496.336443] ioc_weight_write+0x153/0x260
> [ 496.340551] kernfs_fop_write_iter+0x134/0x1d0
> [ 496.345095] new_sync_write+0x10b/0x180
> [ 496.349030] vfs_write+0x26a/0x380
> [ 496.352530] ksys_write+0x58/0xd0
> [ 496.355944] do_syscall_64+0x5c/0x80
> [ 496.359616] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 496.364776]
>
> More logs are available on
> https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/07/29/345042572/build_x86_64_redhat%3A1463074774/tests/lvm_snapper_test/

The following patch should fix the warning:

https://lore.kernel.org/linux-block/20210803070608.1766400-1-ming.lei@xxxxxxxxxx/T/#u


Thanks,
Ming