Re: [PATCH -next] md: split MD_RECOVERY_NEEDED out of mddev_resume

From: Paul Menzel
Date: Mon Dec 04 2023 - 09:09:28 EST


Dear Yu,


Thank you for your patch.

Am 04.12.23 um 04:17 schrieb Yu Kuai:
From: Yu Kuai <yukuai3@xxxxxxxxxx>

New mddev_resume() calls are added to synchroniza IO with array

synchronize

reconfiguration, however, this introduce a regression while adding it in

1. Maybe: … performance regression …
2. introduce*s*

md_start_sync():

1) someone set MD_RECOVERY_NEEDED first;

set*s*

2) daemon thread grab reconfig_mutex, then clear MD_RECOVERY_NEEDED and
queue a new sync work;

grab*s*, clear*s*, queue*s*

3) daemon thread release reconfig_mutex;

release*s*

4) in md_start_sync
a) check that there are spares that can be added/removed, then suspend
the array;
b) remove_and_add_spares may not be called, or called without really
add/remove spares;
c) resume the array, then set MD_RECOVERY_NEEDED again!

Loop between 2 - 4, then mddev_suspend() will be called quite often, for
consequence, normal IO will be quite slow.

It’d be great if you could document the exact “test case”, and numbers.

Fix this problem by spliting MD_RECOVERY_NEEDED out of mddev_resume(), so

split*t*ing

that md_start_sync() won't set such flag and hence the loop will be broken.

Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration")
Reported-and-tested-by: Janpieter Sollie <janpieter.sollie@xxxxxxxxx>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218200
Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---
drivers/md/dm-raid.c | 1 +
drivers/md/md-bitmap.c | 2 ++
drivers/md/md.c | 6 +++++-
drivers/md/raid5.c | 4 ++++
4 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index eb009d6bb03a..e9c0d70f7fe5 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -4059,6 +4059,7 @@ static void raid_resume(struct dm_target *ti)
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
mddev->ro = 0;
mddev->in_sync = 0;
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
}
}
diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 9672f75c3050..16112750ee64 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2428,6 +2428,7 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
}
rv = 0;
out:
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
if (rv)
return rv;
@@ -2571,6 +2572,7 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
if (old_mwb != backlog)
md_bitmap_update_sb(mddev->bitmap);
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
return len;
}
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4b1e8007dd15..48a1b12f3c2c 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -515,7 +515,6 @@ void mddev_resume(struct mddev *mddev)
percpu_ref_resurrect(&mddev->active_io);
wake_up(&mddev->sb_wait);
- set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
@@ -4146,6 +4145,7 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
md_new_event();
rv = len;
out_unlock:
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
return rv;
}
@@ -4652,6 +4652,8 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
out:
if (err)
export_rdev(rdev, mddev);
+ else
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
if (!err)
md_new_event();
@@ -5533,6 +5535,7 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
mddev_destroy_serial_pool(mddev, NULL);
mddev->serialize_policy = value;
unlock:
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
return err ?: len;
}
@@ -6593,6 +6596,7 @@ static void autorun_devices(int part)
export_rdev(rdev, mddev);
}
autorun_array(mddev);
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
}
/* on success, candidates will be empty, on error
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 42ba3581cfea..f88f92517a18 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6989,6 +6989,7 @@ raid5_store_stripe_size(struct mddev *mddev, const char *page, size_t len)
mutex_unlock(&conf->cache_size_mutex);
out_unlock:
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
return err ?: len;
}
@@ -7090,6 +7091,7 @@ raid5_store_skip_copy(struct mddev *mddev, const char *page, size_t len)
else
blk_queue_flag_clear(QUEUE_FLAG_STABLE_WRITES, q);
}
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
return err ?: len;
}
@@ -7169,6 +7171,7 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len)
kfree(old_groups);
}
}
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
return err ?: len;
@@ -8920,6 +8923,7 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
if (!err)
md_update_sb(mddev, 1);
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
mddev_unlock_and_resume(mddev);
return err;

Acked-by: Paul Menzel <pmenzel@xxxxxxxxxxxxx>


Kind regards,

Paul