Re: [PATCH] md/raid5: protect batch_head->bm_seq updates
From: Chen Cheng
Date: Thu Jun 18 2026 - 07:27:07 EST
在 2026/6/18 18:36, Paul Menzel 写道:
> Dear Chen,
>
>
> Thank you for very much.
>
> Am 18.06.26 um 08:55 schrieb Chen Cheng:
>> From: Chen Cheng <chencheng@xxxxxxxxx>
>>
>> bm_seq means "stripe delay to flush until bm_seq <= seq_write".
>>
>> do_release_stripe() keeps STRIPE_BIT_DELAY stripes on bitmap_list
>> when bm_seq >= seq_write.
>>
>> after raid5d() flushes bitmap update and ++seq_write, and
>> active_bit_delay() retry to release delayed stripes.
>>
>> the stripe batch head must carry the newest bm_seq among all
>> member stripes, because the whole batch later released according
>> to the batch head state and bm_seq.
>>
>> race scenario:
>> ===================
>> 1. cpu0 - sh0->bm_seq=101; cpu1 - sh1->bm_seq=102;
>> 2. both cpu0 and cpu1 read batch_head->bm_seq = 100;
>> 3. cpu1 write 102, and cpu0 overwrite with 101;
>>
>> the point is, if the head has a lower bm_seq than one of its
>> members, the whole batch could be released before that
>> member's bitmap is flushed.
>> and the on-disk bitmap not record sh1's changes.
>
> It’s a little hard to read. Could you please improve the wording of the
> last paragraph, and maybe also start each sentence with a capital
> letter. Maybe also use 75 characters per line.
>
> Do you have a reproducer by any chance?
Hi Paul,
Thanks to review, and , I will follow your advise.
Actually, I have some reproducer to hit KCSAN reports in RAID-5, but not
for this one. Because it's reported by sashiko-review bot, and , I think
it's a true risk.
I will try to make a reproducer for this case later , after I figure-out
the other KCSAN reports.
>
>> Signed-off-by: Chen Cheng <chencheng@xxxxxxxxx>
>
> Also add a Fixes: tag?
>
>> ---
>> drivers/md/raid5.c | 13 ++++++-------
>> 1 file changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index a08230aac711..ee145a7bf9e8 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -980,32 +980,31 @@ static void stripe_add_to_batch_list(struct
>> r5conf *conf,
>> /*
>> * at this point, head's BATCH_READY could be cleared, but we
>> * can still add the stripe to batch list
>> */
>> list_add(&sh->batch_list, &head->batch_list);
>> - spin_unlock(&head->batch_head->batch_lock);
>> } else {
>> head->batch_head = head;
>> sh->batch_head = head->batch_head;
>> spin_lock(&head->batch_lock);
>> list_add_tail(&sh->batch_list, &head->batch_list);
>> - spin_unlock(&head->batch_lock);
>> }
>> - if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
>> - if (atomic_dec_return(&conf->preread_active_stripes)
>> - < IO_THRESHOLD)
>> - md_wakeup_thread(conf->mddev->thread);
>> -
>> if (test_and_clear_bit(STRIPE_BIT_DELAY, &sh->state)) {
>> int seq = sh->bm_seq;
>> if (test_bit(STRIPE_BIT_DELAY, &sh->batch_head->state) &&
>> sh->batch_head->bm_seq - seq > 0)
>> seq = sh->batch_head->bm_seq;
>> set_bit(STRIPE_BIT_DELAY, &sh->batch_head->state);
>> sh->batch_head->bm_seq = seq;
>> }
>> + spin_unlock(&head->batch_head->batch_lock);
>> +
>> + if (test_and_clear_bit(STRIPE_PREREAD_ACTIVE, &sh->state))
>> + if (atomic_dec_return(&conf->preread_active_stripes)
>> + < IO_THRESHOLD)
>> + md_wakeup_thread(conf->mddev->thread);
>> atomic_inc(&sh->count);
>> unlock_out:
>> unlock_two_stripes(head, sh);
>> out:
>
>
> Kind regards,
>
> Paul