Re: [PATCH v2 2/3] md/raid1,raid10: fix error-path detection with md_cloned_bio()

From: Yu Kuai

Date: Thu May 21 2026 - 03:34:26 EST


Hi,

在 2026/5/21 15:08, Xiao Ni 写道:
> On Thu, May 21, 2026 at 9:25 AM Yu Kuai <yukuai@xxxxxxxxx> wrote:
>> Hi,
>>
>> 在 2026/5/1 19:46, Abd-Alrhman Masalkhi 写道:
>>> Detect the error path using md_cloned_bio() instead of relying
>>> on r1_bio in raid1 or r10_bio->read_slot in raid10, which may be
>>> NULL or -1 after splitting and resubmitting a failed bio.
>>>
>>> As a result, the error path may not be recognized and memory
>>> allocations can incorrectly use GFP_NOIO instead of
>>> (GFP_NOIO | __GFP_HIGH), which can lead to a deadlock under
>>> memory pressure.
>> I don't think this patch will fix this problem. Because split bio
>> is a new bio that will be resubmit by md_submit_bio(), such bio is
>> not a md_cloned_bio yet until md_account_bio().
> Hi Kuai
>
> bio = bio_submit_split_bioset(bio, max_sectors,
> &conf->bio_split);
> bio_submit_split_bioset returns the split bio and the remainder bio
> will be submitted again which is already a md_cloned_bio.

Okay, I understand now this is a special case that current->bio_list is
NULL. However, the problem still exist in the case original IO split again
while resubmitting, and in this case current->bio_list will be set so new
split io will be resubmit by md_submit_bio(). Which means patch 1 can't
fix this case as well.

Perhaps it's better to check task_struct instead of bio_set.

>
> Regards
> Xiao
>>> Fixes: 689389a06ce7 ("md/raid1: simplify handle_read_error().")
>>> Fixes: 545250f24809 ("md/raid10: simplify handle_read_error()")
>>> Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@xxxxxxxxx>
>>> ---
>>> This patch depends on patch 1.
>>>
>>> Changes in v2:
>>> - New patch.
>>> ---
>>> drivers/md/raid1.c | 13 ++++++++++---
>>> drivers/md/raid10.c | 20 ++++++++++++++------
>>> 2 files changed, 24 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>>> index cc9914bd15c1..c52ecd38c163 100644
>>> --- a/drivers/md/raid1.c
>>> +++ b/drivers/md/raid1.c
>>> @@ -1321,11 +1321,18 @@ static void raid1_read_request(struct mddev *mddev, struct bio *bio,
>>> bool r1bio_existed = !!r1_bio;
>>>
>>> /*
>>> - * If r1_bio is set, we are blocking the raid1d thread
>>> - * so there is a tiny risk of deadlock. So ask for
>>> + * An md cloned bio indicates we are in the error path.
>>> + * This is more reliable than checking r1_bio, which might
>>> + * be NULL even in the error path if a failed bio was split.
>>> + */
>>> + bool err_path = md_cloned_bio(mddev, bio);
>>> +
>>> + /*
>>> + * If we are in the error path, we are blocking the raid1d
>>> + * thread so there is a tiny risk of deadlock. So ask for
>>> * emergency memory if needed.
>>> */
>>> - gfp_t gfp = r1_bio ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
>>> + gfp_t gfp = err_path ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
>>>
>>> /*
>>> * Still need barrier for READ in case that whole
>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>> index 3a591e60a144..8c6fc398260e 100644
>>> --- a/drivers/md/raid10.c
>>> +++ b/drivers/md/raid10.c
>>> @@ -1155,7 +1155,20 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>> char b[BDEVNAME_SIZE];
>>> int slot = r10_bio->read_slot;
>>> struct md_rdev *err_rdev = NULL;
>>> - gfp_t gfp = GFP_NOIO;
>>> +
>>> + /*
>>> + * An md cloned bio indicates we are in the error path.
>>> + * This is more reliable than checking slot, which might
>>> + * be -1 even in the error path if a failed bio was split.
>>> + */
>>> + bool err_path = md_cloned_bio(mddev, bio);
>>> +
>>> + /*
>>> + * If we are in the error path, we are blocking the raid10d
>>> + * thread so there is a tiny risk of deadlock. So ask for
>>> + * emergency memory if needed.
>>> + */
>>> + gfp_t gfp = err_path ? (GFP_NOIO | __GFP_HIGH) : GFP_NOIO;
>>>
>>> if (slot >= 0 && r10_bio->devs[slot].rdev) {
>>> /*
>>> @@ -1166,11 +1179,6 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>>> * we lose the device name in error messages.
>>> */
>>> int disk;
>>> - /*
>>> - * As we are blocking raid10, it is a little safer to
>>> - * use __GFP_HIGH.
>>> - */
>>> - gfp = GFP_NOIO | __GFP_HIGH;
>>>
>>> disk = r10_bio->devs[slot].devnum;
>>> err_rdev = conf->mirrors[disk].rdev;
>> --
>> Thansk,
>> Kuai
>>
--
Thansk,
Kuai