Re: [PATCH] md/raid5: use stripe state snapshot in break_stripe_batch_list()
From: yu kuai
Date: Sat Jun 20 2026 - 17:39:13 EST
Hi,
在 2026/6/18 21:47, Chen Cheng 写道:
> From: Chen Cheng <chencheng@xxxxxxxxx>
>
> The patch just suppress KCSAN noise. No functional change.
>
> RAID-5 can group multi full-stripe-write aka stripe_head into a
> batch aka batch_list, with one head_sh leading them. Call
> break_stripe_batch_list() when the batch is finished, or,
> a stripe has to be dropped out of the batch.
>
> break_stripe_batch_list() reads stripe state several times while
> request paths can update thost state words concurrently with
> lockless bitops, which reported by KCSAN.
>
> Use a snapshot to guarantees that the value used for
> warning, copying, and handle checks is internally consistent
> at current read moment.
>
> KCSAN report:
> ==============================================
> BUG: KCSAN: data-race in __add_stripe_bio / break_stripe_batch_list
>
> write (marked) to 0xffff8e89d4f0b988 of 8 bytes by task 4323 on cpu 3:
> __add_stripe_bio+0x35e/0x400
> raid5_make_request+0x6ac/0x2930
> md_handle_request+0x4a2/0xa40
> md_submit_bio+0x109/0x1a0
> __submit_bio+0x2ec/0x390
> submit_bio_noacct_nocheck+0x457/0x710
> submit_bio_noacct+0x2a7/0xc20
> submit_bio+0x56/0x250
> blkdev_direct_IO+0x54c/0xda0
> blkdev_write_iter+0x38f/0x570
> aio_write+0x22b/0x490
> io_submit_one+0xa51/0xf70
>
> read to 0xffff8e89d4f0b988 of 8 bytes by task 4290 on cpu 4:
> break_stripe_batch_list+0x3ce/0x480
> handle_stripe_clean_event+0x720/0x9b0
> handle_stripe+0x32fb/0x4500
> handle_active_stripes.isra.0+0x6e0/0xa50
> raid5d+0x7e0/0xba0
>
> Signed-off-by: Chen Cheng <chencheng@xxxxxxxxx>
> ---
> drivers/md/raid5.c | 45 ++++++++++++++++++++++++++-------------------
> 1 file changed, 26 insertions(+), 19 deletions(-)
You have another patch that is relied on this patch, they should be in one
patchset. What's worse, they are sent in the reverse order :(
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 26c24986e01c..a376560be92e 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -4906,35 +4906,39 @@ static int clear_batch_ready(struct stripe_head *sh)
> static void break_stripe_batch_list(struct stripe_head *head_sh,
> unsigned long handle_flags)
> {
> struct stripe_head *sh, *next;
> int i;
> + unsigned long state;
>
> list_for_each_entry_safe(sh, next, &head_sh->batch_list, batch_list) {
>
> list_del_init(&sh->batch_list);
>
> - WARN_ONCE(sh->state & ((1 << STRIPE_ACTIVE) |
> - (1 << STRIPE_SYNCING) |
> - (1 << STRIPE_REPLACED) |
> - (1 << STRIPE_DELAYED) |
> - (1 << STRIPE_BIT_DELAY) |
> - (1 << STRIPE_FULL_WRITE) |
> - (1 << STRIPE_BIOFILL_RUN) |
> - (1 << STRIPE_COMPUTE_RUN) |
> - (1 << STRIPE_DISCARD) |
> - (1 << STRIPE_BATCH_READY) |
> - (1 << STRIPE_BATCH_ERR)),
> - "stripe state: %lx\n", sh->state);
> - WARN_ONCE(head_sh->state & ((1 << STRIPE_DISCARD) |
> - (1 << STRIPE_REPLACED)),
> - "head stripe state: %lx\n", head_sh->state);
> + state = READ_ONCE(sh->state);
> + WARN_ONCE(state & ((1 << STRIPE_ACTIVE) |
> + (1 << STRIPE_SYNCING) |
> + (1 << STRIPE_REPLACED) |
> + (1 << STRIPE_DELAYED) |
> + (1 << STRIPE_BIT_DELAY) |
> + (1 << STRIPE_FULL_WRITE) |
> + (1 << STRIPE_BIOFILL_RUN) |
> + (1 << STRIPE_COMPUTE_RUN) |
> + (1 << STRIPE_DISCARD) |
> + (1 << STRIPE_BATCH_READY) |
> + (1 << STRIPE_BATCH_ERR)),
> + "stripe state: %lx\n", state);
> +
> + state = READ_ONCE(head_sh->state);
> + WARN_ONCE(state & ((1 << STRIPE_DISCARD) |
> + (1 << STRIPE_REPLACED)),
> + "head stripe state: %lx\n", state);
>
> set_mask_bits(&sh->state, ~(STRIPE_EXPAND_SYNC_FLAGS |
> (1 << STRIPE_PREREAD_ACTIVE) |
> (1 << STRIPE_ON_UNPLUG_LIST)),
> - head_sh->state & (1 << STRIPE_INSYNC));
> + state & (1 << STRIPE_INSYNC));
>
> sh->check_state = head_sh->check_state;
> sh->reconstruct_state = head_sh->reconstruct_state;
> spin_lock_irq(&sh->stripe_lock);
> sh->batch_head = NULL;
> @@ -4943,22 +4947,25 @@ static void break_stripe_batch_list(struct stripe_head *head_sh,
> if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags))
> wake_up_bit(&sh->dev[i].flags, R5_Overlap);
> sh->dev[i].flags = head_sh->dev[i].flags &
> (~((1 << R5_WriteError) | (1 << R5_Overlap)));
> }
> - if (handle_flags == 0 ||
> - sh->state & handle_flags)
> +
> + state = READ_ONCE(sh->state);
> + if (handle_flags == 0 || (state & handle_flags))
> set_bit(STRIPE_HANDLE, &sh->state);
> raid5_release_stripe(sh);
> }
> spin_lock_irq(&head_sh->stripe_lock);
> head_sh->batch_head = NULL;
> spin_unlock_irq(&head_sh->stripe_lock);
> for (i = 0; i < head_sh->disks; i++)
> if (test_and_clear_bit(R5_Overlap, &head_sh->dev[i].flags))
> wake_up_bit(&head_sh->dev[i].flags, R5_Overlap);
> - if (head_sh->state & handle_flags)
> +
> + state = READ_ONCE(head_sh->state);
> + if (state & handle_flags)
> set_bit(STRIPE_HANDLE, &head_sh->state);
> }
>
> /*
> * handle_stripe - do things to a stripe.
--
Thanks,
Kuai