Re: [PATCH] Change ll_rw_block() calls in JBD

From: Zoltan Menyhart
Date: Tue Jun 20 2006 - 12:31:58 EST


I have got some crashes due to:

Assertion failure in __journal_file_buffer():
"jh->b_transaction == transaction || jh->b_transaction == 0"

[<a0000002053b44e0>] __journal_file_buffer+0x420/0x7c0 [jbd]
r32 : e000000161a1f3e0 jh
r33 : e00000010396a380 transaction
r34 : 0000000000000008 jlist == BJ_Locked

*(struct journal_head *) 0xe000000161a1f3e0: // jh
{
b_bh = 0xe00000048bb36930,
b_jcount = 0x0,
b_jlist = 0x1,
b_modified = 0x0,
b_frozen_data = 0x0,
b_committed_data = 0x0,
b_transaction = 0xe0000020014adb80, // ->j_running_transaction
b_next_transaction = 0x0,
b_tnext = 0xe0000001c17306e0,
b_tprev = 0xe00000204757e540,
b_cp_transaction = 0x0,
b_cpnext = 0x0,
b_cpprev = 0x0
}

*(struct buffer_head *) 0xe00000048bb36930: // jh->b_bh
{
b_state = 0x8201d,
b_this_page = 0xe00000048bb33d88,
b_page = 0xa07ffffff9201300,
b_count = {
counter = 0x2
},
b_size = 0x1000,
b_blocknr = 0xadc001,
b_data = 0xe000000492a0e000,
b_bdev = 0xe0000023fe1ca300,
b_end_io = 0xa000000100630be0,
b_private = 0xe000000161a1f3e0,
b_assoc_buffers = {
next = 0xe00000048bb36978,
prev = 0xe00000048bb36978
}

--- Called from --- :

journal_submit_data_buffers+0x200/0x660 [jbd]
r32 : e0000001035ec100 journal
r33 : e00000010396a380 commit_transaction

As you can see, the current "jh" has been stolen for the new
"->j_running_transaction" while we released temporarily "->j_list_lock"
in the middle of "journal_submit_data_buffers()".

Therefore the test "jh->b_jlist != BJ_SyncData", i.e. if it is still
on a (_any_) sync. list is not enough.

--- linux-2.6.16.20-orig/fs/jbd/commit.c 2006-06-20 17:19:47.000000000 +0200
+++ linux-2.6.16.20/fs/jbd/commit.c 2006-06-20 17:35:54.000000000 +0200
@@ -219,15 +219,26 @@
bufs = 0;
lock_buffer(bh);
spin_lock(&journal->j_list_lock);
+ /* Stolen (e.g. for a new transaction) ? */
+ if (jh != commit_transaction->t_sync_datalist) {
+ unlock_buffer(bh);
+ JBUFFER_TRACE(jh, "stolen sync. data");
+ put_bh(bh);
+ continue;
+ }
/* Someone already cleaned up the buffer? */
- if (!buffer_jbd(bh)
- || jh->b_jlist != BJ_SyncData) {
+
+ // Can this happen???
+
+ if (!buffer_jbd(bh)) {
unlock_buffer(bh);
BUFFER_TRACE(bh, "already cleaned up");
put_bh(bh);
continue;
}
put_bh(bh);
+ J_ASSERT_JH(jh, jh->b_transaction ==
+ commit_transaction);
}
if (test_clear_buffer_dirty(bh)) {
BUFFER_TRACE(bh, "needs writeout, submitting");

I am not really sure that the test "!buffer_jbd(bh)" is really useful.
I left it alone for not introducing a new bug.
If you can confirm that it is not necessary, I can take it away.

Thanks,

Zoltan





-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/