Re: [PATCH v2] md/raid5: protect bitmap batch counters aka seq_flush/seq_write consistency
From: yu kuai
Date: Sat Jun 20 2026 - 17:14:22 EST
Hi,
在 2026/6/17 22:28, Chen Cheng 写道:
> From: Chen Cheng <chencheng@xxxxxxxxx>
>
> kcsan detect race :
> - raid5d() closes the current bitmap batch by updating
> conf->seq_flush under conf->device_lock.
> - __add_stripe_bio() read conf->seq_flush without that
> lock when assigning sh->bm_seq.
>
> so, protect seq_flush/seq_write consistency for multiple CPUs by
> READ_ONCE()/WRITE_ONCE() under the path without held device_lock.
>
> re-explain the stripe batch sequence number update flow:
> 1. sh->bm_seq declare which batch number the stripe belongs to
> when perform bitmap-related write.
> ==> bm_seq = seq_flush+1
>
> 2. stripe be handled,
> * if sh->bm_seq - conf->seq_write > 0, means the
> batch stripes **newer than** the last written
> batch, it cannot proceed yet, queued on bitmap_list.
> * otherwise , has already proceed.
>
> 3. raid5d() `++seq_flush` to closes the current batch, means
> * no more stripes join that old batch
> * just-closed batch ready to write-out to disk
>
> 4. raid5d() calls bitmap hooks unplug() or writeout, then,
> `++seq_write` to the same as bm_seq.
>
> - seq_flush - for producer, to close batches.
> - seq_write - for consumer, the checkpoint number.
>
> the report:
> ====================================
> BUG: KCSAN: data-race in __add_stripe_bio / raid5d
>
> write to 0xffff88ba5625d470 of 4 bytes by task 82401 on cpu 0:
> raid5d+0x1d9/0xba0
> [.....]
>
> read to 0xffff88ba5625d470 of 4 bytes by task 82421 on cpu 8:
> __add_stripe_bio+0x332/0x400
> raid5_make_request+0x6ac/0x2930
> md_handle_request+0x4a2/0xa40
> md_submit_bio+0x109/0x1a0
> __submit_bio+0x2ec/0x390
> [.....]
>
> v1 -> v2:
> - remove WRITE_ONCE(conf->seq_write) in held device_lock path.
> - remove READ_ONCE(conf->seq_flush) in held device_lock path.
>
> Signed-off-by: Chen Cheng <chencheng@xxxxxxxxx>
> ---
> drivers/md/raid5.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index a320b71d7117..b2c5a1930841 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3536,11 +3536,11 @@ static void __add_stripe_bio(struct stripe_head *sh, struct bio *bi,
> pr_debug("added bi b#%llu to stripe s#%llu, disk %d, logical %llu\n",
> (*bip)->bi_iter.bi_sector, sh->sector, dd_idx,
> sh->dev[dd_idx].sector);
>
> if (conf->mddev->bitmap && firstwrite && !sh->batch_head) {
> - sh->bm_seq = conf->seq_flush+1;
> + sh->bm_seq = READ_ONCE(conf->seq_flush) + 1;
> set_bit(STRIPE_BIT_DELAY, &sh->state);
> }
> }
>
> /*
> @@ -5827,11 +5827,11 @@ static void make_discard_request(struct mddev *mddev, struct bio *bi)
> md_write_inc(mddev, bi);
> sh->overwrite_disks++;
> }
> spin_unlock_irq(&sh->stripe_lock);
> if (conf->mddev->bitmap) {
> - sh->bm_seq = conf->seq_flush + 1;
> + sh->bm_seq = READ_ONCE(conf->seq_flush) + 1;
> set_bit(STRIPE_BIT_DELAY, &sh->state);
> }
>
> set_bit(STRIPE_HANDLE, &sh->state);
> clear_bit(STRIPE_DELAYED, &sh->state);
> @@ -6877,16 +6877,17 @@ static void raid5d(struct md_thread *thread)
> clear_bit(R5_DID_ALLOC, &conf->cache_state);
>
> if (
> !list_empty(&conf->bitmap_list)) {
> /* Now is a good time to flush some bitmap updates */
> - conf->seq_flush++;
> + int seq = conf->seq_flush + 1;
checkpatch should warn this about missing blank line after declaration.
> + WRITE_ONCE(conf->seq_flush, seq);
WRITE_ONCE(a, a + 1) is safe, you can see lots of examples.
> spin_unlock_irq(&conf->device_lock);
> if (md_bitmap_enabled(mddev, true))
> mddev->bitmap_ops->unplug(mddev, true);
> spin_lock_irq(&conf->device_lock);
> - conf->seq_write = conf->seq_flush;
> + conf->seq_write = seq;
> activate_bit_delay(conf, conf->temp_inactive_list);
> }
> raid5_activate_delayed(conf);
>
> while ((bio = remove_bio_from_retry(conf, &offset))) {
--
Thanks,
Kuai