Re: [PATCH 3.13.y-ckt 10/60] md/raid10: always set reshape_safe when initializing reshape_position.

From: Kamal Mostafa
Date: Fri Sep 04 2015 - 15:39:53 EST


On Tue, 2015-09-01 at 17:57 -0700, Kamal Mostafa wrote:
> 3.13.11-ckt26 -stable review patch. If anyone has any objections, please let me know.

I'm deferring this commit until the next 3.13-stable release (along with
"md: flush ->event_work before stopping array."), as per the guidance on
the "Cc: stable" lines of the upstream commits.

-Kamal


> ------------------
>
> From: NeilBrown <neilb@xxxxxxxx>
>
> commit 299b0685e31c9f3dcc2d58ee3beca761a40b44b3 upstream.
>
> 'reshape_position' tracks where in the reshape we have reached.
> 'reshape_safe' tracks where in the reshape we have safely recorded
> in the metadata.
>
> These are compared to determine when to update the metadata.
> So it is important that reshape_safe is initialised properly.
> Currently it isn't. When starting a reshape from the beginning
> it usually has the correct value by luck. But when reducing the
> number of devices in a RAID10, it has the wrong value and this leads
> to the metadata not being updated correctly.
> This can lead to corruption if the reshape is not allowed to complete.
>
> This patch is suitable for any -stable kernel which supports RAID10
> reshape, which is 3.5 and later.
>
> Fixes: 3ea7daa5d7fd ("md/raid10: add reshape support")
> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
> Signed-off-by: Kamal Mostafa <kamal@xxxxxxxxxxxxx>
> ---
> drivers/md/raid10.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 1b707ad..b8215a3 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -3597,6 +3597,7 @@ static struct r10conf *setup_conf(struct mddev *mddev)
> /* far_copies must be 1 */
> conf->prev.stride = conf->dev_sectors;
> }
> + conf->reshape_safe = conf->reshape_progress;
> spin_lock_init(&conf->device_lock);
> INIT_LIST_HEAD(&conf->retry_list);
>
> @@ -3804,7 +3805,6 @@ static int run(struct mddev *mddev)
> }
> conf->offset_diff = min_offset_diff;
>
> - conf->reshape_safe = conf->reshape_progress;
> clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
> clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
> set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
> @@ -4149,6 +4149,7 @@ static int raid10_start_reshape(struct mddev *mddev)
> conf->reshape_progress = size;
> } else
> conf->reshape_progress = 0;
> + conf->reshape_safe = conf->reshape_progress;
> spin_unlock_irq(&conf->device_lock);
>
> if (mddev->delta_disks && mddev->bitmap) {
> @@ -4215,6 +4216,7 @@ abort:
> rdev->new_data_offset = rdev->data_offset;
> smp_wmb();
> conf->reshape_progress = MaxSector;
> + conf->reshape_safe = MaxSector;
> mddev->reshape_position = MaxSector;
> spin_unlock_irq(&conf->device_lock);
> return ret;
> @@ -4566,6 +4568,7 @@ static void end_reshape(struct r10conf *conf)
> md_finish_reshape(conf->mddev);
> smp_wmb();
> conf->reshape_progress = MaxSector;
> + conf->reshape_safe = MaxSector;
> spin_unlock_irq(&conf->device_lock);
>
> /* read-ahead size must cover two whole stripes, which is
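
For anyone reading the commit message above without the driver open, the
invariant the patch enforces can be sketched in userspace. This is a toy
model, not the raid10.c code: struct toy_conf, toy_start_reshape(),
toy_end_reshape(), toy_unrecorded() and MAX_SECTOR are hypothetical names,
and the "distance" computed here is a simplification that stands in for
whatever comparison the driver really makes. It only illustrates why
reshape_safe should be assigned wherever reshape_progress is, in particular
for a backwards reshape that starts at the end of the array.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's sector_t and MaxSector. */
typedef uint64_t sector_t;
#define MAX_SECTOR (~(sector_t)0)

/* Toy model of the two fields discussed in the commit message. */
struct toy_conf {
	sector_t reshape_progress;	/* how far the reshape has reached */
	sector_t reshape_safe;		/* how far is recorded in metadata */
};

/*
 * Start a reshape. A backwards reshape (assumed here to model reducing
 * the number of devices) begins at 'size' and moves toward 0; a forwards
 * reshape begins at 0. Both fields are set together, mirroring the idea
 * of the patch.
 */
static void toy_start_reshape(struct toy_conf *c, bool backwards, sector_t size)
{
	c->reshape_progress = backwards ? size : 0;
	c->reshape_safe = c->reshape_progress;
}

/* Finish or abort a reshape: both fields go to the sentinel together. */
static void toy_end_reshape(struct toy_conf *c)
{
	c->reshape_progress = MAX_SECTOR;
	c->reshape_safe = MAX_SECTOR;
}

/*
 * Sectors reshaped but not yet recorded in metadata (simplified).
 * Unsigned arithmetic wraps, so a stale reshape_safe yields nonsense
 * rather than a trap.
 */
static sector_t toy_unrecorded(const struct toy_conf *c, bool backwards)
{
	return backwards ? c->reshape_safe - c->reshape_progress
			 : c->reshape_progress - c->reshape_safe;
}
```

With both fields set together, a backwards reshape that starts at sector
1000 and has advanced to 900 reports 100 unrecorded sectors. If
reshape_safe were left at a stale 0 instead, the same subtraction wraps to
a meaningless value, which is the sort of bad comparison the commit message
warns about.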

