Re: [RFC PATCH] fs/buffer: serialize set_buffer_uptodate against concurrent clears

From: Matthew Wilcox

Date: Sat Apr 25 2026 - 22:43:05 EST


On Sat, Apr 25, 2026 at 10:01:37PM -0400, Chao Shi wrote:
> A WARN_ON_ONCE(!buffer_uptodate(bh)) in mark_buffer_dirty() is reachable
> from the buffered write path on a block device when the underlying
> device returns I/O errors at high density. Reproduced by fuzzing an
> NVMe controller (FEMU) that returns crafted error completions for a
> sustained workload from /dev/nvme0n1.
>
> The race is:
>
> CPU A: block_commit_write (folio lock held)   CPU B: end_buffer_async_read
>   set_buffer_uptodate(bh);
>                                                 clear_buffer_uptodate(bh);
>   mark_buffer_dirty(bh); /* WARN fires */

Why are we calling clear_buffer_uptodate() in end_buffer_async_read()?
If the buffer is uptodate, we shouldn't be reading into it. If it's
not uptodate, we don't need to clear the uptodate flag because it's
already clear.
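
To make that argument concrete, here is a rough userspace sketch, not
kernel code. struct model_bh and the model_*() helpers are invented for
illustration and only mimic the two flags involved; they assume a read
is only ever submitted against a buffer that is not uptodate.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two buffer_head state bits discussed. */
struct model_bh {
	atomic_bool uptodate;
	atomic_bool dirty;
};

/*
 * Model of a read completion that never clears the uptodate flag.
 * On success it sets the flag; on error it leaves the flag untouched,
 * on the assumption that it was already clear when the read was issued.
 */
static void model_end_read(struct model_bh *bh, bool io_ok)
{
	if (io_ok)
		atomic_store(&bh->uptodate, true);
}

/*
 * Model of the write-commit side: set uptodate, then mark dirty.
 * Returns false where the uptodate check before dirtying would fail.
 */
static bool model_commit_write(struct model_bh *bh)
{
	atomic_store(&bh->uptodate, true);
	bool ok = atomic_load(&bh->uptodate);
	atomic_store(&bh->dirty, true);
	return ok;
}
```

In this model nothing ever stores false to the uptodate flag, so no
interleaving of model_end_read() with model_commit_write() can make the
check fail. Whether the real buffer cache can rely on the same
assumption is exactly the open question.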

I've been deleting calls to ClearPageUptodate() and folio_clear_uptodate()
from filesystems; it's almost always the wrong thing to do. But the
buffer cache does have slightly different rules from the page cache,
so this may not translate well.