Re: [PATCH -next 1/5] md/raid5: don't allow replacement while reshape is not done

From: Song Liu
Date: Fri May 19 2023 - 19:34:19 EST


On Thu, May 11, 2023 at 6:59 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> Setting an rdev as a replacement requires, among others, these two conditions:
>
> 1) MD_RECOVERY_RUNNING is not set;
> 2) rdev nr_pending is 0;

The above is confusing. I updated it and applied the set to md-next.
Please let me know if it looks good.

Thanks,
Song

>
> If reshape is interrupted (for example, by echoing frozen to sync_action),
> rdev replacement can still be set. That is safe at runtime because reshape
> always takes priority over resync in md_check_recovery(). However, if the
> system reboots, the kernel will complain that it cannot handle concurrent
> replacement and reshape, and the array can no longer be assembled.
>
> Fix this problem by not allowing replacement until reshape is done.
>
> Reported-by: Peter Neuwirth <reddunur@xxxxxxxxx>
> Link: https://lore.kernel.org/linux-raid/e2f96772-bfbc-f43b-6da1-f520e5164536@xxxxxxxxx/
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> ---
> drivers/md/raid5.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index a58507a4345d..bd3b535c0739 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -8378,6 +8378,7 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
> p = conf->disks + disk;
> tmp = rdev_mdlock_deref(mddev, p->rdev);
> if (test_bit(WantReplacement, &tmp->flags) &&
> + mddev->reshape_position == MaxSector &&
> p->replacement == NULL) {
> clear_bit(In_sync, &rdev->flags);
> set_bit(Replacement, &rdev->flags);
> --
> 2.39.2
>
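For anyone reading along, here is a minimal user-space sketch of the condition the hunk above ends up testing. It is only a model: the model_* types and may_add_replacement() are hypothetical stand-ins, not the kernel's struct mddev, struct md_rdev or raid5_add_disk(), and the flag bits are reduced to a plain bool. The only thing taken from the patch is the rule itself: the slot's current rdev wants replacement, reshape_position equals MaxSector (no reshape pending), and the slot has no replacement yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's sector_t and MaxSector ("no reshape in progress"). */
typedef uint64_t sector_t;
#define MaxSector (~(sector_t)0)

/* Hypothetical, heavily simplified models of the structures involved. */
struct model_rdev {
	bool want_replacement;          /* models the WantReplacement flag bit */
};

struct model_slot {
	struct model_rdev *rdev;        /* device currently in the slot */
	struct model_rdev *replacement; /* replacement device, if any */
};

struct model_mddev {
	sector_t reshape_position;      /* MaxSector when no reshape is pending */
};

/*
 * Returns true when a new device may be attached as a replacement for
 * the device in slot p, following the three-part test from the patch.
 */
bool may_add_replacement(const struct model_mddev *mddev,
			 const struct model_slot *p)
{
	return p->rdev != NULL &&
	       p->rdev->want_replacement &&
	       mddev->reshape_position == MaxSector &&
	       p->replacement == NULL;
}
```

With this model, an array whose reshape_position is anything other than MaxSector (an interrupted reshape, for instance) rejects the replacement, while the same slot is accepted once reshape_position is back at MaxSector.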