Re: WARNING in jbd2_journal_update_sb_log_tail
From: Jan Kara
Date: Wed Jan 15 2025 - 12:53:55 EST
On Wed 15-01-25 13:00:23, Heming Zhao wrote:
> Hello Jan,
>
> On 1/15/25 09:32, Liebes Wang wrote:
> > The bisection log shows the first cause commit is a09decff5c32060639a685581c380f51b14e1fc2:
> > a09decff5c32 jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal
> >
> > The full bisection log is attached. Hope this helps.
>
> This bisearch commit a09decff5c32 appears to be the root cause
> of this issue. It fixed one issue but introduced another.
>
> Syzbot tested the patch with calling jbd2_journal_wipe() with 'write=1'.
> The Syzbot test result [1] shows that the same WARN_ON() is triggered
> in a subsequent routine – the classic whack-a-mole!
>
> Back to commit a09decff5c32, it opened a door to allow jbd2 to update
> sb regardless of whether the value of sb items are correct.
>
> To fix a09decff5c32, it seems that jbd2 needs to add more sanity check
> codes in a sub-routine of jbd2_journal_load().
>
> btw, in my view, this is a jbd2 issue not ocfs2/ext4 issue.
>
> [1]: https://lore.kernel.org/ocfs2-devel/04a9ad29-51de-4b50-a5bb-56f91817639d@xxxxxxxx/T/#m86d01f83d808868bb5e6548d30f79b4f9f889b13
Thanks for debugging this! So I'm not 100% convinced this is only jbd2 bug
because jbd2_journal_recover() was never intended to be called after
jbd2_journal_skip_recovery() (called from jbd2_journal_wipe()). You're
supposed to call either jbd2_journal_wipe() or jbd2_journal_recover() but
not both. So IMO this needs fixing in OCFS2 code. That being said you've
also pointed at one bug in jbd2 code - the WARN_ON(!sb->s_sequence) in
jbd2_journal_update_sb_log_tail() is indeed wrong. We were inconsistent
inside jbd2 whether TID 0 is considered valid or not and relatively
recently we've decided to accept TID 0 as valid but this place was left
out. I'll send a fix for that.
Honza
> > Heming Zhao <heming.zhao@xxxxxxxx <mailto:heming.zhao@xxxxxxxx>> 于2025年1月14日周二 22:51写道:
> >
> > Hi Ted,
> >
> > On 1/14/25 21:38, Theodore Ts'o wrote:
> > > On Tue, Jan 14, 2025 at 02:25:21PM +0800, Heming Zhao wrote:
> > >>
> > >> The root cause appears to be that the jbd2 bypass recovery logic
> > >> is incorrect.
> > >
> > > Heming, thanks for taking a look.
> > >
> > > I'm not convinced the root cause is what you've stated. When
> > > jbd2_journal_wipe() calls jbd2_mark_journal_empty(), s_start gets set
> > > to zero:
> >
> > Actually, ocfs2 calls jbd2_journal_wipe() with 'write=0' (hard coded),
> > so jbd2_mark_journal_empty() isn't called during the ocfs2 mount
> > phase. This means the following deduction won't apply in this case.
> >
> > -- Heming
> >
> > >
> > > sb->s_start = cpu_to_be32(0);
> > >
> > > This then gets checked in jbd2_journal_recovery:
> > >
> > > if (!sb->s_start) {
> > > jbd2_debug(1, "No recovery required, last transaction %d, head block %u\n",
> > > be32_to_cpu(sb->s_sequence), be32_to_cpu(sb->s_head));
> > > journal->j_transaction_sequence = be32_to_cpu(sb->s_sequence) + 1;
> > > journal->j_head = be32_to_cpu(sb->s_head);
> > > return 0;
> > > }
> > >
> > > I suspect that there is something else wrong with jbd2's superblock,
> > > since this normally works in the absence of malicious fs image
> > > fuzzing, such that when jbd2_journal_load() calls reset_journal()
> > > after jbd2_journal_recover() correctly bypasses recovery, the WARN_ON
> > > gets triggered.
> > >
> > > I'd suggest that you enable jbd2 debugging so we can see all of the
> > > jbd2_debug() message to understand what might be going on.
> > >
> > > By the way, given that this is only a WARN_ON, and it involves
> > > malicious image fuzzing, this is probably a valid jbd2 bug, but it's
> > > not actually a security bug. Sure, someone silly enough to pick up a
> > > maliciously corrupted USB thumb drive dropped in a parking lot and
> > > insert it into their desktop, and the distribution is silly enoough to
> > > allow automount, the worse that can happen is that the system to
> > > reboot if the system is configured to panic on a WARNING. So feel
> > > free to prioritize your investigation appropriately. :-)
> > >
> > > Cheers,
> > >
> > > - Ted
> >
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR