[PATCH v2 2/3] md/raid1,raid10: fix error-path detection with md_cloned_bio()
From: Abd-Alrhman Masalkhi
Date: Fri May 01 2026 - 07:47:34 EST
Detect the error path using md_cloned_bio() instead of relying on
r1_bio in raid1 or r10_bio->read_slot in raid10. After a failed bio
is split and resubmitted, these may be NULL or -1 respectively, even
though we are still in the error path.

When that happens, the error path is not recognized and memory
allocations incorrectly use GFP_NOIO instead of
(GFP_NOIO | __GFP_HIGH), which can lead to a deadlock under
memory pressure.
Fixes: 689389a06ce7 ("md/raid1: simplify handle_read_error().")
Fixes: 545250f24809 ("md/raid10: simplify handle_read_error()")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@xxxxxxxxx>
---
This patch depends on patch 1.
Changes in v2:
- New patch.
---
drivers/md/raid1.c | 13 ++++++++++---
drivers/md/raid10.c | 20 ++++++++++++++------
2 files changed, 24 insertions(+), 9 deletions(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index cc9914bd15c1..c52ecd38c163 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1321,11 +1321,18 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
bool r1bio_existed = !!r1_bio;
/*
- * If r1_bio is set, we are blocking the raid1d thread
- * so there is a tiny risk of deadlock. So ask for
+ * An md cloned bio indicates we are in the error path.
+ * This is more reliable than checking r1_bio, which might
+ * be NULL even in the error path if a failed bio was split.
+ */
+ bool err_path = md_cloned_bio(mddev, bio);
+
+ /*
+ * If we are in the error path, we are blocking the raid1d
+ * thread so there is a tiny risk of deadlock. So ask for
* emergency memory if needed.
*/
- gfp_t gfp = r1_bio ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
+ gfp_t gfp = err_path ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
/*
* Still need barrier for READ in case that whole
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 3a591e60a144..8c6fc398260e 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1155,7 +1155,20 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
char b[BDEVNAME_SIZE];
int slot = r10_bio->read_slot;
struct md_rdev *err_rdev = NULL;
- gfp_t gfp = GFP_NOIO;
+
+ /*
+ * An md cloned bio indicates we are in the error path.
+ * This is more reliable than checking slot, which might
+ * be -1 even in the error path if a failed bio was split.
+ */
+ bool err_path = md_cloned_bio(mddev, bio);
+
+ /*
+ * If we are in the error path, we are blocking the raid10d
+ * thread so there is a tiny risk of deadlock. So ask for
+ * emergency memory if needed.
+ */
+ gfp_t gfp = err_path ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
if (slot >= 0 && r10_bio->devs[slot].rdev) {
/*
@@ -1166,11 +1179,6 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
* we lose the device name in error messages.
*/
int disk;
- /*
- * As we are blocking raid10, it is a little safer to
- * use __GFP_HIGH.
- */
- gfp = GFP_NOIO | __GFP_HIGH;
disk = r10_bio->devs[slot].devnum;
err_rdev = conf->mirrors[disk].rdev;
--
2.43.0