Re: [PATCH] md/raid5: fix race between reshape and chunk-aligned read

From: Yu Kuai

Date: Sun Apr 19 2026 - 01:27:36 EST


Hi,

在 2026/4/9 13:17, FengWei Shih 写道:
> raid5_make_request() checks mddev->reshape_position to decide whether
> to allow chunk-aligned reads. However in raid5_start_reshape(), the
> layout configuration (raid_disks, algorithm, etc.) is updated before
> mddev->reshape_position is set:
>
> reshape (raid5_start_reshape) read (raid5_make_request)
> ============================== ===========================
> write_seqcount_begin
> update raid_disks, algorithm...
> set conf->reshape_progress
> write_seqcount_end
> check mddev->reshape_position
> * still MaxSector, allow
> raid5_read_one_chunk()
> * use new layout
> raid5_quiesce()
> set mddev->reshape_position

While we're here, I think it's pretty ugly to disable raid5_read_one_chunk
when reshape is not fully done. So a better solution should be:

- data behind reshape: read with new layout
- data ahead of reshape: read with old layout, reshape will also need to wait
for this IO to be done, before reshape can make progress.
- data intersecting the active reshape window: wait for reshape to make progress.

>
> Since reshape_position is not yet updated, raid5_make_request()
> considers no reshape is in progress and proceeds with the
> chunk-aligned path, but the layout has already changed, causing
> raid5_compute_sector() to return an incorrect physical address.
>
> Fix this by reading conf->reshape_progress under gen_lock in
> raid5_read_one_chunk() and falling back to the stripe path if a
> reshape is in progress.
>
> Signed-off-by: FengWei Shih <dannyshih@xxxxxxxxxxxx>
> ---
> drivers/md/raid5.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index a8e8d431071b..bded2b86f0ef 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -5421,6 +5421,11 @@ static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio)
> sector_t sector, end_sector;
> int dd_idx;
> bool did_inc;
> + int seq;
> +
> + seq = read_seqcount_begin(&conf->gen_lock);
> + if (unlikely(conf->reshape_progress != MaxSector))
> + return 0;
>
> if (!in_chunk_boundary(mddev, raid_bio)) {
> pr_debug("%s: non aligned\n", __func__);
> @@ -5431,6 +5436,9 @@ static int raid5_read_one_chunk(struct mddev *mddev, struct bio *raid_bio)
> &dd_idx, NULL);
> end_sector = sector + bio_sectors(raid_bio);
>
> + if (read_seqcount_retry(&conf->gen_lock, seq))
> + return 0;
> +
> if (r5c_big_stripe_cached(conf, sector))
> return 0;
>

--
Thansk,
Kuai