[PATCH 8/8] md/raid5: reserve stripe cache for user I/O during rebuild
From: Hiroshi Nishida
Date: Wed Jun 24 2026 - 12:03:30 EST
The resync read-ahead window (RAID5_SYNC_WINDOW) can fill the stripe
cache with rebuild stripes and starve concurrent user I/O, so throughput
alternates in bursts between the rebuild and the application.
Add two yield points to the window-submission loop:

 - stop the window immediately if any thread is waiting for a stripe
   (waitqueue_active(&conf->wait_for_stripe)); the check is
   intentionally racy -- a waiter appearing just after it is seen by
   the next sync_request call, so no barrier is needed.

 - stop expanding once active_stripes reaches half the cache
   (max_nr_stripes / RAID5_SYNC_HWMARK), but only when
   preread_active_stripes > 0, i.e. user write I/O is actually
   competing. Sync stripes never set STRIPE_PREREAD_ACTIVE, so during
   a pure rebuild the counter stays zero and the window fills freely;
   rebuild-only throughput is unchanged. Competing user reads do not
   bump the counter but are caught by the waitqueue_active() check.

While user I/O is present, the rebuild stops growing its window once
half of the stripe cache is active, so application latency no longer
collapses during the read-ahead bursts. A rebuild that has the array to
itself is not throttled.
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Hiroshi Nishida <nishidafmly@xxxxxxxxx>
---
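Note for reviewers: the two yield conditions from the commit message
can be modelled in userspace as a single predicate. This is only a
sketch for reasoning about the thresholds, not the kernel code. The
struct sync_cache_state, SYNC_HWMARK and sync_window_should_yield()
names are hypothetical stand-ins, and plain ints replace the kernel's
atomic counters and wait queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the r5conf fields the patch reads. */
struct sync_cache_state {
	bool stripe_waiters;        /* models waitqueue_active(&conf->wait_for_stripe) */
	int preread_active_stripes; /* models atomic_read(&conf->preread_active_stripes) */
	int active_stripes;         /* models atomic_read(&conf->active_stripes) */
	int max_nr_stripes;         /* models conf->max_nr_stripes */
};

#define SYNC_HWMARK 2 /* mirrors RAID5_SYNC_HWMARK from the patch */

/* Returns true when the window-submission loop would break. */
static bool sync_window_should_yield(const struct sync_cache_state *s)
{
	assert(s->max_nr_stripes > 0);

	/* Yield point 1: someone is waiting for a stripe. */
	if (s->stripe_waiters)
		return true;

	/*
	 * Yield point 2: user writes are competing and the number of
	 * active stripes has reached max_nr_stripes / SYNC_HWMARK
	 * (integer division, as in the patch).
	 */
	return s->preread_active_stripes > 0 &&
	       s->active_stripes >= s->max_nr_stripes / SYNC_HWMARK;
}
```

With an illustrative max_nr_stripes of 256, the model yields at 128
active stripes when preread_active_stripes is non-zero, and never on
the high-water check alone when it is zero.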
drivers/md/raid5.c | 21 +++++++++++++++++++++
drivers/md/raid5.h | 1 +
2 files changed, 22 insertions(+)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index ad6230415af3..480f3aa069ef 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6656,6 +6656,27 @@ static inline sector_t raid5_sync_request(struct mddev *mddev, sector_t sector_n
submitted < RAID5_SYNC_WINDOW && win_sector < max_sector &&
win_sector < mddev->resync_max;
submitted++, win_sector += RAID5_STRIPE_SECTORS(conf)) {
+ /*
+ * Yield to user I/O: stop the read-ahead if anyone is waiting
+ * for a stripe. The check is intentionally racy -- a waiter
+ * appearing just after is serviced by the next sync_request
+ * call, so no barrier is needed.
+ */
+ if (waitqueue_active(&conf->wait_for_stripe))
+ break;
+ /*
+ * Reserve cache for user I/O only when it is actually competing.
+ * preread_active_stripes counts stripes queued for write I/O
+ * (including the read phase of RMW); sync stripes never set
+ * STRIPE_PREREAD_ACTIVE, so during a pure rebuild it stays zero
+ * and the window fills freely. Competing user reads do not bump
+ * the counter but are caught by the waitqueue_active() check
+ * above.
+ */
+ if (atomic_read(&conf->preread_active_stripes) > 0 &&
+ atomic_read(&conf->active_stripes) >=
+ conf->max_nr_stripes / RAID5_SYNC_HWMARK)
+ break;
sh = raid5_get_active_stripe(conf, NULL, win_sector,
R5_GAS_NOBLOCK);
if (!sh)
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 1f37dabd727b..7833cc07597f 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -499,6 +499,7 @@ struct disk_info {
#define MAX_STRIPE_BATCH 32 /* stripes per handle_active_stripes pass */
#define STRIPE_BATCH_WORKERS 8 /* stripes-per-worker threshold for spawning */
#define RAID5_SYNC_WINDOW 32 /* stripes to pre-submit per sync_request call */
+#define RAID5_SYNC_HWMARK 2 /* under user writes, stop sync window at 1/N of stripe cache active */
/* NR_STRIPE_HASH_LOCKS must be a power of two, since
* STRIPE_HASH_LOCKS_MASK masks with (NR_STRIPE_HASH_LOCKS - 1).
--
2.43.0