Re: [PATCH v1] mm: fix data-race in folio_batch_count()

From: David Hildenbrand (Arm)

Date: Wed Jun 24 2026 - 10:42:45 EST

On 6/24/26 16:23, Lorenzo Stoakes wrote:
> On Wed, Jun 24, 2026 at 05:26:06PM +0800, Xuewen Wang wrote:
>> KCSAN reports:
>>
>> BUG: KCSAN: data-race in __lru_add_drain_all / folio_batch_add_and_move
>
> Where? A syzbot report? A local run? Please specify.
>
>>
>> write to 0xffff98fe74c015f8 of 1 bytes by task 45153 on cpu 2:
>> folio_batch_add+0x30/0xe0
>>
>> read to 0xffff98fe74c015f8 of 1 bytes by task 45175 on cpu 0:
>> folio_batch_count+0x0/0x10
>> cpu_needs_drain+0x253/0x430
>>
>> The write side is a per-cpu local operation (folio_batch_add on the
>> CPU that owns the per-cpu batch), while cpu_needs_drain() reads
>> another CPU's per-cpu batch without locking. Reading a slightly stale
>> value is harmless -- it only determines whether to schedule a drain,
>
> Then why are we adding a READ_ONCE() in such a core helper?
>
>> and a subsequent check will catch it.
>
> Where? Which check? Be specific.
>
>>
>> Use READ_ONCE() to annotate the read and prevent load tearing, which
>> also suppresses the KCSAN warning.
>
> Tearing on a single byte? Which architecture tears a single byte?
>
> I think you're actually more concerned about the value being optimised out on
> assumption of it not being updated elsewhere right?
>
> But acutally you're not, because everybody else uses a stack variable or _their
> own_ per-CPU value?
>
> Only cpu_needs_drain() is the odd one out right?
>
>>
>> Signed-off-by: Xuewen Wang <wangxuewen@xxxxxxxxxx>
>> ---
>> include/linux/folio_batch.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/include/linux/folio_batch.h b/include/linux/folio_batch.h
>> index b45946adc50b..1e31e058e19d 100644
>> --- a/include/linux/folio_batch.h
>> +++ b/include/linux/folio_batch.h
>> @@ -53,7 +53,7 @@ static inline void folio_batch_reinit(struct folio_batch *fbatch)
>>
>> static inline unsigned int folio_batch_count(const struct folio_batch *fbatch)
>> {
>> - return fbatch->nr;
>> + return READ_ONCE(fbatch->nr);
>
> This isn't free, you're breaking optimisations here by doing that...
>
> It feels like the wrong level of abstraction, but actually I think every other
> case is either stack or per-CPU _on its own CPU_ (please check), in which case
> we _can_ suppress the check here but I think best done with data_race().

Fully agreed.

--
Cheers,

David