Re: [PATCH v2] md/raid5: read batch_head under stripe_lock in make_stripe_request
From: yu kuai
Date: Sat Jun 20 2026 - 18:02:11 EST
Hi,
在 2026/6/19 16:11, Chen Cheng 写道:
> From: Chen Cheng <chencheng@xxxxxxxxx>
>
> KCSAN reports race in raid5_make_request() vs. stripe_add_to_batch_list()
>
> Writer flow (stripe_add_to_batch_list):
> 1. grab `head` stripe;
> 2. lock_two_stripes(head, sh);
> 3. re-check stripe_can_batch() for both head and sh, which requires
> STRIPE_BATCH_READY set on both;
> 4. write head->batch_head = head and sh->batch_head = head;
> 5. unlock_two_stripes.
>
> STRIPE_BATCH_READY is cleared in two places:
> - clear_batch_ready(), at the entry of handle_stripe();
> - __add_stripe_bio(), for non-batchable bios.
> And, both need to acquire `stripe_lock`.
>
> Under stripe_lock, if STRIPE_BATCH_READY is clear, then:
> - New writers cannot install a batch_head;
> - Existing writers have already finished.
> So .. handle_stripe() readers can ready `batch_head` locklessly.
This does not explain the race clearly, I still have no clue yet.
>
> Fix way:
> Writer side make_stripe_request() under STRIPE_BATCH_READY, so , need
> to be protected by stripe_lock when read something..
>
> v1 -> v2:
> - re-expalin how stripe_lock and batch_head work in commit message , and ,
> - modify comment in raid5.h.
>
> Fixs: f4aec6a097387
Weird fix tag again.
>
>
> KCSAN report:
> ======================================
> BUG: KCSAN: data-race in raid5_make_request / raid5_make_request
>
> write to 0xffff8f03062432d8 of 8 bytes by task 210246 on cpu 6:
> raid5_make_request+0x175e/0x2ab0
> md_handle_request+0x2c5/0x700
> md_submit_bio+0x126/0x320
> [.........]
> btrfs_sync_file+0x181/0x970
> vfs_fsync_range+0x71/0x110
> do_fsync+0x46/0xa0
> __x64_sys_fsync+0x20/0x30
>
> read to 0xffff8f03062432d8 of 8 bytes by task 210251 on cpu 0:
> raid5_make_request+0x7c7/0x2ab0
> md_handle_request+0x2c5/0x700
> md_submit_bio+0x126/0x320
> [.........]
> btrfs_remap_file_range+0x266/0x980
> vfs_clone_file_range+0x16d/0x610
> ioctl_file_clone+0x64/0xd0
> do_vfs_ioctl+0x87f/0xbc0
> __x64_sys_ioctl+0xb8/0x130
>
> value changed: 0x0000000000000000 -> 0xffff8f0307798728
Is this a mismatch report?
>
>
> Signed-off-by: Chen Cheng <chencheng@xxxxxxxxx>
> ---
> drivers/md/raid5.c | 2 ++
> drivers/md/raid5.h | 8 +++++++-
> 2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 5521051a9425..efc63740f867 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6108,14 +6108,16 @@ static enum stripe_result make_stripe_request(struct mddev *mddev,
> ctx->do_flush = false;
> }
>
> set_bit(STRIPE_HANDLE, &sh->state);
> clear_bit(STRIPE_DELAYED, &sh->state);
> + spin_lock_irq(&sh->stripe_lock);
> if ((!sh->batch_head || sh == sh->batch_head) &&
> (bi->bi_opf & REQ_SYNC) &&
> !test_and_set_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
> atomic_inc(&conf->preread_active_stripes);
> + spin_unlock_irq(&sh->stripe_lock);
>
> release_stripe_plug(mddev, sh);
> return STRIPE_SUCCESS;
>
> out_release:
> diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
> index 1c7b710fc9c1..9ff825697ba3 100644
> --- a/drivers/md/raid5.h
> +++ b/drivers/md/raid5.h
> @@ -221,11 +221,17 @@ struct stripe_head {
> enum reconstruct_states reconstruct_state;
> spinlock_t stripe_lock;
> int cpu;
> struct r5worker_group *group;
>
> - struct stripe_head *batch_head; /* protected by stripe lock */
> + /*
> + * Writer protected by stripe_lock.
> + * Reader hold stripe_lock when STRIPE_BATCH_READY is set.
> + * Without STRIPE_BATCH_READY means no concurrent write,
> + * lockless read is ok.
> + */
> + struct stripe_head *batch_head;
> spinlock_t batch_lock; /* only header's lock is useful */
> struct list_head batch_list; /* protected by head's batch lock*/
>
> union {
> struct r5l_io_unit *log_io;
--
Thanks,
Kuai