Greetings,
This problem was reproduced by our customer using their own testing tool "run_bug".
According to the customer, "run_bug" runs a variety of background workloads while benchmarking (including memory pressure, CPU pressure, file cycle manipulation, the fsstress stress-testing tool, the postmark program, and so on).
The probability of reproducing the problem is relatively low.
In response to your query: in ext4_block_write_begin(), the new state of a bh is cleared before get_block() is called, and a bh for which get_block() fails is not set to new.
However, when the page size is greater than the block size, a page contains multiple bhs, and bh->b_this_page is a circular list linking all the bhs of the same page.
When get_block() fails and we jump out of the for loop, the remaining bhs on bh->b_this_page are never reached, so the loop never clears their new state.
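To illustrate, here is a simplified sketch of the loop shape in ext4_block_write_begin() (an illustration only, not the exact upstream code):

	for (bh = head, block_start = 0; bh != head || !block_start;
	     block++, block_start = block_end, bh = bh->b_this_page) {
		block_end = block_start + blocksize;
		if (block_end <= from || block_start >= to)
			continue;		/* outside the write range */
		if (buffer_new(bh))
			clear_buffer_new(bh);	/* stale flag cleared here... */
		if (!buffer_mapped(bh)) {
			err = get_block(inode, block, bh, 1);
			if (err)
				break;		/* ...but the bhs not yet visited
						 * keep whatever flags they had */
			/* on success: buffer_new may be set, and
			 * do_journal_get_write_access() is called */
		}
	}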
So when ext4_journalled_zero_new_buffers() is later called, it checks every bh of the page and calls write_end_fn() on each one that is in the new state.
That includes the unhandled bhs following the failed one on the circular list, whose new state was set earlier and never cleared.
Because those bhs never got write access, the corresponding transaction is not placed in jh->b_transaction, resulting in a crash.
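For reference, I believe the failing path is roughly the following (from memory; exact function names and assertion wording vary between kernel versions):

	ext4_journalled_zero_new_buffers()
	  -> write_end_fn()			/* for each bh with buffer_new set */
	    -> ext4_dirty_journalled_data()
	      -> jbd2_journal_dirty_metadata()
	           J_ASSERT_JH(jh, jh->b_transaction == transaction ||
	                           jh->b_next_transaction == transaction);

The assertion fires because do_journal_get_write_access() was never called for the bh, so jh->b_transaction was never set.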
Therefore, the approach in the patch I submitted is to walk the unprocessed bhs and, for each one that is in the new state, get write access for it.
Another way would be to call clear_buffer_new() on the remaining unprocessed bhs instead.
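For comparison, that alternative would look roughly like the walk in the patch below, with the get-access call replaced (an untested sketch):

	if (bh != head || !block_start) {
		do {
			if (buffer_new(bh))
				/* drop the stale flag so that
				 * ext4_journalled_zero_new_buffers()
				 * skips this bh */
				clear_buffer_new(bh);
			bh = bh->b_this_page;
		} while (bh != head);
	}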
I personally recommend the get-access approach, since its impact is smaller. Could you please advise: of the two approaches, which one do you recommend?
On Thu 10-10-24 10:58:55, Baolin Liu wrote:
From: Baolin Liu <liubaolin@xxxxxxxxxx>...
Since the merge of commit 3910b513fcdf ("ext4: persist the new uptodate
buffers in ext4_journalled_zero_new_buffers"), a new assertion failure
has occurred under an old kernel (ext3, data=journal, pagesize=64k) with
the corresponding patches ported:
which was caused by a bh being dirtied without calling
do_journal_get_write_access().
In the loop over all the bhs of a page in ext4_block_write_begin(),
when an error occurs, we jump out of the loop.
But that leaves some bhs processed and some not,
which leads to the assertion failure when write_end_fn() is called.
Thanks for the patch but I don't understand one thing here: For
ext4_journalled_zero_new_buffers() to call write_end_fn() the buffer must
have buffer_new flag set. That flag can get set only by ext4_get_block()
function when it succeeds in which case we also call
do_journal_get_write_access(). So how is it possible that buffer_new was
set on a buffer on which we didn't call do_journal_get_write_access()? This
indicates there may be some deeper problem hidden. How exactly did you
trigger this problem?
Honza
To fix that, get write access for the rest of the unprocessed bhs, just
as write_end_fn() expects when it dirties them.
Fixes: 3910b513fcdf ("ext4: persist the new uptodate buffers in ext4_journalled_zero_new_buffers")
Reported-and-tested-by: Zhi Long <longzhi@xxxxxxxxxxxxxx>
Suggested-by: Shida Zhang <zhangshida@xxxxxxxxxx>
Signed-off-by: Baolin Liu <liubaolin@xxxxxxxxxx>
---
fs/ext4/inode.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 54bdd4884fe6..a72f951288e4 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1102,9 +1102,24 @@ int ext4_block_write_begin(handle_t *handle, struct folio *folio,
err = -EIO;
}
if (unlikely(err)) {
- if (should_journal_data)
+ if (should_journal_data) {
+ if (bh != head || !block_start) {
+ do {
+ block_end = block_start + bh->b_size;
+
+ if (buffer_new(bh))
+ if (block_end > from && block_start < to)
+ do_journal_get_write_access(handle,
+ inode, bh);
+
+ block_start = block_end;
+ bh = bh->b_this_page;
+ } while (bh != head);
+ }
+
ext4_journalled_zero_new_buffers(handle, inode, folio,
from, to);
+ }
else
folio_zero_new_buffers(folio, from, to);
} else if (fscrypt_inode_uses_fs_layer_crypto(inode)) {
--
2.39.2