[PATCH] jbd2: avoid mount failure when commit block is partially submitted

From: Ye Bin
Date: Tue Apr 02 2024 - 05:09:12 EST


We encountered a file system that could not be mounted after a power
failure. Analysis of the file system image shows that only part of the
data was written to the last commit block.
To solve this issue, if the commit block checksum is incorrect, check
whether the next block has a valid magic number and the expected transaction
ID. If it does not, drop the last transaction and ignore the checksum
error. Theoretically, the transaction ID can wrap around, which may still
cause the mount to fail.

Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx>
---
fs/jbd2/recovery.c | 39 +++++++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
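
For reviewers, the decision made on the next journal block can be sketched
in user space as below. This is only an illustration, not the kernel code:
every sketch_* name is hypothetical, the block is modelled as a raw byte
buffer instead of a buffer_head, and it assumes the on-disk journal header
layout of three big-endian 32-bit fields (h_magic, h_blocktype, h_sequence).
In the patch, the expected sequence passed in is next_commit_ID + 1.

```c
#include <stdint.h>

/* Same value as JBD2_MAGIC_NUMBER in the kernel headers. */
#define SKETCH_JBD2_MAGIC 0xc03b3998U

/* Hypothetical verdicts; the patch itself returns 0 or -EFSBADCRC. */
enum sketch_verdict {
	SKETCH_DROP_INCOMPLETE = 0,	/* ignore CRC error, drop last transaction */
	SKETCH_REPORT_BAD_CRC = 1	/* next transaction follows: real CRC error */
};

/* Decode a big-endian 32-bit value, as be32_to_cpu() would. */
static uint32_t sketch_load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Encode a big-endian 32-bit value; only used to build example blocks. */
static void sketch_store_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/*
 * Classify a commit block whose checksum failed by looking at the block
 * that follows it. next_block must hold at least 12 bytes.
 */
static enum sketch_verdict sketch_classify(const unsigned char *next_block,
					   uint32_t expected_seq)
{
	uint32_t magic = sketch_load_be32(next_block);		/* h_magic */
	uint32_t seq = sketch_load_be32(next_block + 8);	/* h_sequence */

	if (magic != SKETCH_JBD2_MAGIC || seq != expected_seq)
		return SKETCH_DROP_INCOMPLETE;
	return SKETCH_REPORT_BAD_CRC;
}
```

A zero-filled next block, or one carrying a stale sequence number, is
classified as an incomplete commit. Only a block with both the magic
number and the expected sequence is classified as a checksum failure.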

diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
index 1f7664984d6e..0a09f1a5fd1e 100644
--- a/fs/jbd2/recovery.c
+++ b/fs/jbd2/recovery.c
@@ -463,6 +463,41 @@ static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
return tag->t_checksum == cpu_to_be16(csum32);
}

+static int check_incomplete_commit(journal_t *journal, unsigned long next_block,
+ unsigned int next_commit_ID)
+{
+ journal_header_t *tmp;
+ struct buffer_head *bh;
+ int err = 0;
+
+ err = jread(&bh, journal, next_block);
+ if (err)
+ return err;
+
+ tmp = (journal_header_t *)bh->b_data;
+ /*
+ * If the next block does not belong to the following transaction, the
+ * checksum error in the current commit block is assumed to come from
+ * an incomplete commit. Ignore the checksum error and drop the last
+ * transaction.
+ */
+ if (tmp->h_magic != cpu_to_be32(JBD2_MAGIC_NUMBER) ||
+ be32_to_cpu(tmp->h_sequence) != next_commit_ID) {
+ jbd2_debug("JBD2: will drop incomplete transaction %u commit block %lu\n",
+ next_commit_ID - 1, next_block - 1);
+ goto out;
+ }
+
+ pr_err("JBD2: potential continuous transaction detected %u at %lu, "
+ "likely invalid checksum in transaction %u\n",
+ next_commit_ID, next_block, next_commit_ID - 1);
+
+ err = -EFSBADCRC;
+out:
+ brelse(bh);
+ return err;
+}
+
static int do_one_pass(journal_t *journal,
struct recovery_info *info, enum passtype pass)
{
@@ -810,6 +845,10 @@ static int do_one_pass(journal_t *journal,
if (pass == PASS_SCAN &&
!jbd2_commit_block_csum_verify(journal,
bh->b_data)) {
+ if (!check_incomplete_commit(journal,
+ next_log_block,
+ next_commit_ID + 1))
+ goto ignore_crc_mismatch;
chksum_error:
if (commit_time < last_trans_commit_time)
goto ignore_crc_mismatch;
--
2.31.1