[PATCH v2] scsi: use GFP_NOIO to avoid circular locking dependency

From: Rik van Riel
Date: Wed Jan 29 2025 - 10:47:03 EST


On Tue, 28 Jan 2025 21:35:18 -0800
Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

> GFP_NOFS is never the right thing for block layer allocations.
> The right thing here is GFP_NOIO which is a superset of GFP_NOFS.
> Otherwise you could reproduce the same deadlock when using swap
> instead of a file system to reproduce basically the same deadlock.

Duh, you are right of course!

The fixed up patch with GFP_NOIO is below.

---8<---

From 74272b4537415fd7d94c216e422510c27aa88fa0 Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel@xxxxxxxxxxx>
Date: Tue, 28 Jan 2025 16:35:39 -0500
Subject: [PATCH] scsi: use GFP_NOIO to avoid circular locking dependency
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Filesystems can write to disk from page reclaim with __GFP_FS
set. Marc found a case where scsi_realloc_sdev_budget_map
ends up in page reclaim with GFP_KERNEL, where it could try
to take filesystem locks again, leading to a deadlock.

WARNING: possible circular locking dependency detected
6.13.0 #1 Not tainted
------------------------------------------------------
kswapd0/70 is trying to acquire lock:
ffff8881025d5d78 (&q->q_usage_counter(io)){++++}-{0:0}, at: blk_mq_submit_bio+0x461/0x6e0

but task is already holding lock:
ffffffff81ef5f40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x9f/0x760

The full lockdep splat can be found in Marc's report:

https://lkml.org/lkml/2025/1/24/1101

Avoid the potential deadlock by doing the allocation with GFP_NOIO,
which prevents both filesystem and block layer recursion.

Reported-by: Marc Aurèle La France <tsi@xxxxxxxxxx>
Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
---
drivers/scsi/scsi_scan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index f2093982b3db..b0964b6dd646 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -245,7 +245,7 @@ static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev,
}
ret = sbitmap_init_node(&sdev->budget_map,
scsi_device_max_queue_depth(sdev),
- new_shift, GFP_KERNEL,
+ new_shift, GFP_NOIO,
sdev->request_queue->node, false, true);
if (!ret)
sbitmap_resize(&sdev->budget_map, depth);
--
2.47.1