Re: [PATCH] md/raid5: fix race between reshape and chunk-aligned read
From: Yu Kuai
Date: Tue Apr 28 2026 - 04:35:45 EST
Hi,
在 2026/4/21 15:09, FengWei Shih 写道:
> Hi Kuai,
>
> Yu Kuai 於 2026/4/19 下午 01:14 寫道:
>> Hi,
>>
>> 在 2026/4/9 13:17, FengWei Shih 写道:
>>> raid5_make_request() checks mddev->reshape_position to decide whether
>>> to allow chunk-aligned reads. However in raid5_start_reshape(), the
>>> layout configuration (raid_disks, algorithm, etc.) is updated before
>>> mddev->reshape_position is set:
>>>
>>> reshape (raid5_start_reshape) read (raid5_make_request)
>>> ============================== ===========================
>>> write_seqcount_begin
>>> update raid_disks, algorithm...
>>> set conf->reshape_progress
>>> write_seqcount_end
>>> check mddev->reshape_position
>>> * still MaxSector, allow
>>> raid5_read_one_chunk()
>>> * use new layout
>>> raid5_quiesce()
>>> set mddev->reshape_position
>> While we're here, I think it's pretty ugly to disable
>> raid5_read_one_chunk
>> when reshape is not fully done. So a better solution should be:
>>
>> - data behind reshape: read with new layout
>> - data ahead of reshape: read with old layout, reshape will also need
>> to wait
>> for this IO to be done, before reshape can make progress.
>> - data intersecting the active reshape window: wait for reshape to
>> make progress.
>
> Thanks for the feedback. I agree that using reshape_progress to
> distinguish
> the three cases (ahead/behind/inside reshape) would be more refined.
>
> But disabling chunk-aligned reads during reshape was already the
> existing design.
> In the original code, the check is at the caller level:
>
> if (rw == READ && mddev->degraded == 0 &&
> mddev->reshape_position == MaxSector) {
> bi = chunk_aligned_read(mddev, bi);
> }
>
> My patch is focused on fixing the race condition in the existing
> lockless check of whether
> reshape is in progress. So just to confirm: are you suggesting we add
> a mechanism to
> allow chunk-aligned reads during reshape (based on reshape progress),
> rather
> than simply disabling them?
Yes, I think this is a huge performance improvement. And in this case it's totally
fine to do chunk aligned read with old layout since reshape do not start yet.
>
>>> Since reshape_position is not yet updated, raid5_make_request()
>>> considers no reshape is in progress and proceeds with the
>>> chunk-aligned path, but the layout has already changed, causing
>>> raid5_compute_sector() to return an incorrect physical address.
>>>
>>> Fix this by reading conf->reshape_progress under gen_lock in
>>> raid5_read_one_chunk() and falling back to the stripe path if a
>>> reshape is in progress.
>>>
>>> Signed-off-by: FengWei Shih <dannyshih@xxxxxxxxxxxx>
>>> ---
>>> drivers/md/raid5.c | 8 ++++++++
>>> 1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>>> index a8e8d431071b..bded2b86f0ef 100644
>>> --- a/drivers/md/raid5.c
>>> +++ b/drivers/md/raid5.c
>>> @@ -5421,6 +5421,11 @@ static int raid5_read_one_chunk(struct mddev
>>> *mddev, struct bio *raid_bio)
>>> sector_t sector, end_sector;
>>> int dd_idx;
>>> bool did_inc;
>>> + int seq;
>>> +
>>> + seq = read_seqcount_begin(&conf->gen_lock);
>>> + if (unlikely(conf->reshape_progress != MaxSector))
>>> + return 0;
>>> if (!in_chunk_boundary(mddev, raid_bio)) {
>>> pr_debug("%s: non aligned\n", __func__);
>>> @@ -5431,6 +5436,9 @@ static int raid5_read_one_chunk(struct mddev
>>> *mddev, struct bio *raid_bio)
>>> &dd_idx, NULL);
>>> end_sector = sector + bio_sectors(raid_bio);
>>> + if (read_seqcount_retry(&conf->gen_lock, seq))
>>> + return 0;
>>> +
>>> if (r5c_big_stripe_cached(conf, sector))
>>> return 0;
> Thanks,
> FengWei Shih
>
> Disclaimer: The contents of this e-mail message and any attachments
> are confidential and are intended solely for addressee. The
> information may also be legally privileged. This transmission is sent
> in trust, for the sole purpose of delivery to the intended recipient.
> If you have received this transmission in error, any use, reproduction
> or dissemination of this transmission is strictly prohibited. If you
> are not the intended recipient, please immediately notify the sender
> by reply e-mail or phone and delete this message and its attachments,
> if any.
--
Thansk,
Kuai