Re: [PATCH 0/3] ocfs2: stop BUG_ON crashes in suballoc invalid-dinode paths

From: ZhengYuan Huang

Date: Wed Apr 08 2026 - 23:38:08 EST


On Fri, Apr 3, 2026 at 5:30 PM Joseph Qi <joseph.qi@xxxxxxxxxxxxxxxxx> wrote:
> On 4/3/26 2:30 PM, ZhengYuan Huang wrote:
> > commit 10995aa2451a ("ocfs2: Morph the haphazard
> > OCFS2_IS_VALID_DINODE() checks.") converted several OCFS2 dinode
> > corruption checks from graceful error handling to BUG_ON() under the
> > assumption that every caller only sees validated inode buffers.
> >
> > That assumption does not always hold for JBD-managed buffers. The common
> > inode read path can still hand suballoc code an invalid dinode, which turns
> > crafted filesystem corruption into a kernel panic instead of a normal OCFS2
> > filesystem error.
> >
>
> When inode first read from disk, it will call ocfs2_validate_inode_block()
> to validate if it is valid.
> So it seems this is a code bug once the buffer is modified? Or how it
> happens?
>
> Thanks,
> Joseph

This bug was discovered by our fuzzing framework. The fuzzer mutates
filesystem metadata on disk to test filesystem robustness, but it does
not modify in-memory state.

The full crash log was truncated for reasons we have not yet
identified, so we do not currently have a deterministic reproducer.
We are still working on reconstructing one from the partial traces.

From our current analysis, one possible explanation is that the
initial inode validation does not guarantee the buffer remains valid
for its entire lifetime:

On mount, OCFS2 loads local system inodes before journal replay, so
the allocator inode can be instantiated and validated first.
Afterwards, dirty journal replay writes filesystem blocks back through
jbd2 recovery using __getblk(j_fs_dev, blocknr) + memcpy(nbh->b_data,
...) + mark_buffer_dirty(), which can overwrite the same cached bh
that was previously validated.
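
To make the suspected ordering concrete, here is a small userspace
sketch. It is not kernel code: the toy_* names, the four-slot cache and
the 16-byte block are all hypothetical, and it assumes the earlier read
and the replay both resolve to the same cached buffer for a given block
number. It only shows that a "validated" marker set by the first read
says nothing about the contents after a later overwrite.

```c
#include <assert.h>
#include <string.h>

#define TOY_BLOCK_SIZE 16
#define TOY_CACHE_SLOTS 4
#define TOY_SIG "INODE01"   /* stand-in for the dinode signature */

/* Hypothetical stand-in for a cached buffer; not struct buffer_head. */
struct toy_bh {
    unsigned long blocknr;
    int validated;                 /* set once, by the first read only */
    char data[TOY_BLOCK_SIZE];
};

static struct toy_bh toy_cache[TOY_CACHE_SLOTS];

/* Modeled on __getblk(): hands back the cached buffer, never a copy. */
static struct toy_bh *toy_getblk(unsigned long blocknr)
{
    assert(blocknr < TOY_CACHE_SLOTS);
    toy_cache[blocknr].blocknr = blocknr;
    return &toy_cache[blocknr];
}

static int toy_signature_ok(const struct toy_bh *bh)
{
    return memcmp(bh->data, TOY_SIG, strlen(TOY_SIG)) == 0;
}

/* First read: fill from the "disk" image and validate exactly once. */
static int toy_first_read(unsigned long blocknr, const char *disk)
{
    struct toy_bh *bh = toy_getblk(blocknr);

    memcpy(bh->data, disk, TOY_BLOCK_SIZE);
    if (!toy_signature_ok(bh))
        return -1;
    bh->validated = 1;
    return 0;
}

/* Replay: copy the journaled image over the cached buffer, no recheck. */
static void toy_replay(unsigned long blocknr, const char *journal_copy)
{
    struct toy_bh *nbh = toy_getblk(blocknr);

    memcpy(nbh->data, journal_copy, TOY_BLOCK_SIZE);
}
```

Calling toy_first_read() with a good image and then toy_replay() with a
corrupted one leaves validated set while toy_signature_ok() fails,
which is the state we suspect the allocator inode bh ends up in.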

Later, OCFS2 rereads inode blocks through ocfs2_inode_lock paths, and
those read paths explicitly skip inode validation when buffer_jbd(bh)
is set. This is visible both in ocfs2_read_blocks_sync() and
ocfs2_read_blocks(), and the latter even documents that journal-held
buffers never get NeedsValidate set.

Normal allocator updates make these dinode buffers JBD-managed via
ocfs2_journal_access_di() -> jbd2_journal_get_write_access() ->
set_buffer_jbd(bh). So the bug is not that the very first read forgot
to validate; it is that a previously validated system-inode bh can be
changed later, and subsequent JBD-owned rereads bypass validation
before reaching the BUG_ON in suballoc.
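
The reread side can be sketched the same way. Again this is a userspace
model under our current assumptions, not the OCFS2 code: toy_bh, the jbd
flag and every toy_* function are hypothetical, and the -EIO return in
the consumer is only an illustration of "return an error instead of
BUG_ON()", not necessarily what the patches do.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define TOY_SIG "INODE01"   /* stand-in for the dinode signature */

/* Hypothetical stand-in for a buffer; not struct buffer_head. */
struct toy_bh {
    int jbd;                /* models buffer_jbd(bh) */
    int validate_calls;     /* how many times validation actually ran */
    char data[16];
};

static int toy_validate(struct toy_bh *bh)
{
    bh->validate_calls++;
    return memcmp(bh->data, TOY_SIG, strlen(TOY_SIG)) == 0 ? 0 : -EIO;
}

/* Models journal access marking the buffer as JBD-managed. */
static void toy_journal_access(struct toy_bh *bh)
{
    bh->jbd = 1;
}

/* Read path: journal-held buffers are returned without validation. */
static int toy_read_block(struct toy_bh *bh)
{
    assert(bh != NULL);
    if (bh->jbd)
        return 0;
    return toy_validate(bh);
}

/* Consumer that rechecks the signature and fails gracefully. */
static int toy_suballoc_use(const struct toy_bh *bh)
{
    if (memcmp(bh->data, TOY_SIG, strlen(TOY_SIG)) != 0)
        return -EIO;        /* the real code hits BUG_ON() here */
    return 0;
}
```

In this model, once toy_journal_access() has run, toy_read_block()
returns 0 for a corrupted buffer without calling toy_validate(), so the
only remaining check is the one in the consumer.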

In short, the initial validation does not appear to be missing.
Rather, a validated buffer can be modified afterwards (e.g., by
journal replay), and later accesses skip validation because of the
buffer's JBD state.

We are still investigating and will update if we manage to produce a
reliable reproducer.

Thanks,
ZhengYuan Huang