[RFC PATCH] jbd2: fix a potential assertion failure due to improperly dirtied buffer

From: zhangshida
Date: Sat Jul 20 2024 - 02:24:13 EST


From: Shida Zhang <zhangshida@xxxxxxxxxx>

On an old kernel version(4.19, ext3, journal=data, pagesize=64k),
an assertion failure will occasionally be triggered by the line below:

jbd2_journal_commit_transaction
{
...
J_ASSERT_BH(bh, !buffer_dirty(bh));
/*
* The buffer on BJ_Forget list and not jbddirty means
...
}

AFAIC, that's how the problem works:
--------
journal_unmap_buffer
jbd2_journal_invalidatepage
__ext4_journalled_invalidatepage
ext4_journalled_invalidatepage
do_invalidatepage
truncate_inode_pages_range
truncate_inode_pages
truncate_pagecache
ext4_setattr
--------
First try to truncate and invalidate the page.
Sometimes the buffer and the page won't be freed immediately.
the buffer will be sent to the BJ_Forget list of the currently
committing transaction. Maybe the transaction knows when and how
to free the buffer better.
The buffer's states now: !jbd_dirty !mapped !dirty

Then jbd2_journal_commit_transaction()will try to iterate over the
BJ_Forget list:
--------
jbd2_journal_commit_transaction()
while (commit_transaction->t_forget) {
...
}
--------

At this exact moment, another write comes:
--------
mark_buffer_dirty
__block_write_begin_int
__block_write_begin
ext4_write_begin
--------
it sees a unmapped new buffer, and marks it as dirty.

Finally, jbd2_journal_commit_transaction()will trip over the assertion
during the BJ_Forget list iteration.

After an code analysis, it seems that nothing can prevent the
__block_write_begin() from dirtying the buffer at this very moment.

The same condition may also be applied to the lattest kernel version.

To fix it:
Add some checks to see if it is the case dirtied by __block_write_begin().
if yes, it's okay and just let it go. The one who dirtied it and the
jbd2_journal_commit_transaction() will know how to cooperate together and
deal with it in a proper manner.
if no, try to complain as we normally will do.

Reported-by: Baolin Liu <liubaolin@xxxxxxxxxx>
Signed-off-by: Shida Zhang <zhangshida@xxxxxxxxxx>
---
fs/jbd2/commit.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index 75ea4e9a5cab..2c13d0af92d8 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -1023,7 +1023,20 @@ void jbd2_journal_commit_transaction(journal_t *journal)
if (is_journal_aborted(journal))
clear_buffer_jbddirty(bh);
} else {
- J_ASSERT_BH(bh, !buffer_dirty(bh));
+ bool try_to_complain = 1;
+ struct folio *folio = NULL;
+
+ folio = bh->b_folio;
+ /*
+ * Try not to complain about the dirty buffer marked dirty
+ * by __block_write_begin().
+ */
+ if (buffer_dirty(bh) && folio && folio->mapping
+ && folio_test_locked(folio))
+ try_to_complain = 0;
+
+ if (try_to_complain)
+ J_ASSERT_BH(bh, !buffer_dirty(bh));
/*
* The buffer on BJ_Forget list and not jbddirty means
* it has been freed by this transaction and hence it
--
2.33.0