Re: [PATCH] ocfs2: revalidate the journal dinode before toggling dirty
From: Joseph Qi
Date: Sun May 10 2026 - 00:02:18 EST
On 5/9/26 9:52 PM, ZhengYuan Huang wrote:
> [BUG]
> A fuzzed OCFS2 image can corrupt the current slot journal dinode while
> mount is still in progress. The mount path first reports the invalid
> journal block and then crashes in shutdown:
>
> kernel BUG at fs/ocfs2/journal.c:1034!
> Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
> RIP: 0010:ocfs2_journal_toggle_dirty+0x2d6/0x340 fs/ocfs2/journal.c:1034
> Call Trace:
> ocfs2_journal_shutdown+0x414/0xc30 fs/ocfs2/journal.c:1116
> ocfs2_mount_volume fs/ocfs2/super.c:1785 [inline]
> ocfs2_fill_super+0x30a9/0x3cd0 fs/ocfs2/super.c:1083
> get_tree_bdev_flags+0x38b/0x640 fs/super.c:1698
> get_tree_bdev+0x24/0x40 fs/super.c:1721
> ocfs2_get_tree+0x21/0x30 fs/ocfs2/super.c:1184
> vfs_get_tree+0x9a/0x370 fs/super.c:1758
> fc_mount fs/namespace.c:1199 [inline]
> do_new_mount_fc fs/namespace.c:3642 [inline]
> do_new_mount fs/namespace.c:3718 [inline]
> path_mount+0x5b8/0x1ea0 fs/namespace.c:4028
> do_mount fs/namespace.c:4041 [inline]
> __do_sys_mount fs/namespace.c:4229 [inline]
> __se_sys_mount fs/namespace.c:4206 [inline]
> __x64_sys_mount+0x282/0x320 fs/namespace.c:4206
> ...
>
>
> [CAUSE]
> ocfs2_journal_toggle_dirty() assumes journal->j_bh still contains the
> same validated dinode that ocfs2_journal_init() locked earlier, and it
> uses BUG_ON() when the buffer no longer looks like a dinode. That
> assumption is too strong. The mount path can force the same current-slot
> journal inode block back in from disk through
> ocfs2_read_journal_inode(..., OCFS2_BH_IGNORE_CACHE) while
> ocfs2_mark_dead_nodes() scans the journal slots. If that reread finds
> corrupted metadata, mount unwinds through ocfs2_journal_shutdown(),
> which reuses journal->j_bh and turns the metadata corruption into a
> kernel BUG.
>
I'm a bit confused.
The journal dinode is validated first, which means the image has already been checked.
With mount still in progress, how can the dinode become corrupted at runtime?
Thanks,
Joseph
> [FIX]
> Revalidate journal->j_bh with ocfs2_validate_inode_block() before
> updating the dirty flag. If the cached journal dinode has become
> invalid, return the corruption error and keep the failure on OCFS2's
> normal read-only/error path instead of crashing the kernel. This
> revalidation happens in the cold path of mount, so the performance
> impact should be negligible.
>
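The [FIX] paragraph above replaces an assertion with a validate-then-return-error pattern. A minimal user-space sketch of that pattern follows. It is an illustration only: `demo_dinode`, `demo_validate()`, `demo_toggle_dirty()`, and the `DEMO_*` constants are hypothetical stand-ins, not the OCFS2 on-disk layout or the kernel helpers named in the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; this is NOT the real OCFS2 dinode layout. */
#define DEMO_SIGNATURE  "INODE01"
#define DEMO_FLAG_DIRTY 0x1u

struct demo_dinode {
	char     signature[8];	/* 7 characters plus the terminating NUL */
	uint32_t flags;
};

/* Return 0 if the buffer still looks like a dinode, -EIO otherwise. */
static int demo_validate(const struct demo_dinode *di)
{
	if (memcmp(di->signature, DEMO_SIGNATURE, sizeof(DEMO_SIGNATURE)) != 0)
		return -EIO;
	return 0;
}

/*
 * Re-check the cached block before touching it. A corrupted block
 * yields an error for the caller to handle instead of an assertion
 * failure, and the block itself is left unmodified.
 */
static int demo_toggle_dirty(struct demo_dinode *di, int dirty)
{
	int status = demo_validate(di);

	if (status < 0)
		return status;

	if (dirty)
		di->flags |= DEMO_FLAG_DIRTY;
	else
		di->flags &= ~DEMO_FLAG_DIRTY;
	return 0;
}
```

With this shape, the caller on the unwind path decides what to do with the error; the sketch does not model how the cached buffer came to be rewritten in the first place.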
> Signed-off-by: ZhengYuan Huang <gality369@xxxxxxxxx>
> ---
> fs/ocfs2/journal.c | 13 ++++++++-----
> 1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
> index f9bf3bac085d..c9a972a1304e 100644
> --- a/fs/ocfs2/journal.c
> +++ b/fs/ocfs2/journal.c
> @@ -1021,12 +1021,15 @@ static int ocfs2_journal_toggle_dirty(struct ocfs2_super *osb,
>  	struct buffer_head *bh = journal->j_bh;
>  	struct ocfs2_dinode *fe;
>
> -	fe = (struct ocfs2_dinode *)bh->b_data;
> +	/* The journal inode block can be forced back in from disk while the
> +	 * mount path is still running, so validate the cached bh again before
> +	 * updating the journal state on disk.
> +	 */
> +	status = ocfs2_validate_inode_block(osb->sb, bh);
> +	if (status < 0)
> +		return status;
>
> -	/* The journal bh on the osb always comes from ocfs2_journal_init()
> -	 * and was validated there inside ocfs2_inode_lock_full(). It's a
> -	 * code bug if we mess it up. */
> -	BUG_ON(!OCFS2_IS_VALID_DINODE(fe));
> +	fe = (struct ocfs2_dinode *)bh->b_data;
>
>  	flags = le32_to_cpu(fe->id1.journal1.ij_flags);
>  	if (dirty)