Re: [PATCH v2 2/2] jbd2: gracefully abort on transaction state corruptions

From: Andreas Dilger

Date: Mon Mar 02 2026 - 18:32:05 EST


On Mar 2, 2026, at 14:34, Milos Nikic <nikic.milos@xxxxxxxxx> wrote:
>
> Auditing the jbd2 codebase reveals several legacy J_ASSERT calls
> that enforce internal state machine invariants (e.g., verifying
> jh->b_transaction or jh->b_next_transaction pointers).
>
> When these invariants are broken, the journal is in a corrupted
> state. However, triggering a fatal panic brings down the entire
> system for a localized filesystem error.
>
> This patch targets a specific class of these asserts: those
> residing inside functions that natively return integer error codes,
> booleans, or error pointers. It replaces the hard J_ASSERTs with
> WARN_ON_ONCE to capture the offending stack trace, safely drops
> any held locks, gracefully aborts the journal, and returns -EINVAL.
>
> This prevents a catastrophic kernel panic while ensuring the
> corrupted journal state is safely contained and upstream callers
> (like ext4 or ocfs2) can gracefully handle the aborted handle.
>
> Functions modified in fs/jbd2/transaction.c:
> - jbd2__journal_start()
> - do_get_write_access()
> - jbd2_journal_dirty_metadata()
> - jbd2_journal_forget()
> - jbd2_journal_try_to_free_buffers()
> - jbd2_journal_file_inode()
>
> Signed-off-by: Milos Nikic <nikic.milos@xxxxxxxxx>

Looks good, though a minor suggestion for some of the replacements.

Reviewed-by: Andreas Dilger <adilger@xxxxxxxxx>

> @@ -1069,13 +1076,24 @@ do_get_write_access(handle_t *handle, struct
>  	JBUFFER_TRACE(jh, "owned by older transaction");
> -	J_ASSERT_JH(jh, jh->b_next_transaction == NULL);
> -	J_ASSERT_JH(jh, jh->b_transaction == journal->j_committing_transaction);
> +	if (WARN_ON_ONCE(jh->b_next_transaction ||
> +			 jh->b_transaction !=
> +			 journal->j_committing_transaction)) {
> +		spin_unlock(&jh->b_state_lock);
> +		error = -EINVAL;
> +		jbd2_journal_abort(journal, error);
> +		goto out;
> +	}

In cases like this where you are checking multiple conditions in a
single WARN_ON_ONCE() it isn't possible to know which condition
failed. It would be better to add a pr_err() in the failure case to
print b_next_transaction, j_committing_transaction, and b_transaction
so it is easier to debug if this is ever hit.
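
To illustrate the idea, here is a rough userspace sketch, not a tested
kernel patch. Everything in it is a hypothetical stand-in: the structs
carry only the fields discussed above, warn_on_once() and pr_err() are
simple stderr shims for the kernel macros, journal->aborted stands in for
jbd2_journal_abort(), the helper name check_owned_by_older() is made up,
and the b_state_lock handling from the patch is left out. The message
format is only a suggestion.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-ins for the jbd2 types (assumption). */
struct transaction { unsigned int t_tid; };
struct journal {
	struct transaction *j_committing_transaction;
	int aborted;	/* stands in for jbd2_journal_abort() */
};
struct journal_head {
	struct transaction *b_transaction;
	struct transaction *b_next_transaction;
};

/* Userspace shim for the kernel's pr_err(). */
#define pr_err(...) fprintf(stderr, __VA_ARGS__)

/*
 * Simplified shim for WARN_ON_ONCE(): warns only the first time the
 * condition is true and always returns the condition. The kernel macro
 * is per call site and dumps a stack trace; this one does neither.
 */
static int warn_on_once(int cond, const char *what)
{
	static int warned;

	if (cond && !warned) {
		warned = 1;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Model of the "owned by older transaction" check: one combined
 * warning, followed by a pr_err() that prints all three pointers so
 * the failing condition can be identified from the log.
 */
static int check_owned_by_older(struct journal *journal,
				struct journal_head *jh)
{
	if (warn_on_once(jh->b_next_transaction != NULL ||
			 jh->b_transaction !=
			 journal->j_committing_transaction,
			 "inconsistent journal_head state")) {
		pr_err("b_transaction=%p b_next_transaction=%p j_committing_transaction=%p\n",
		       (void *)jh->b_transaction,
		       (void *)jh->b_next_transaction,
		       (void *)journal->j_committing_transaction);
		journal->aborted = 1;
		return -EINVAL;
	}
	return 0;
}
```

Because the pr_err() is not gated by the "once" logic, it would still
report the values on later hits after the stack trace has been
suppressed. Whether that is desirable or too noisy is a judgment call.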

Cheers, Andreas