Re: 2.6.18 ext3 panic.

From: Eric Sandeen
Date: Tue Oct 10 2006 - 18:03:51 EST


Jan Kara wrote:

> I think it's really the 1KB block size that makes it happen.
> I've looked at journal_dirty_data() code and I think the following can
> happen:
> sync() eventually ends up in journal_dirty_data(bh) as Eric writes.
> There is finds dirty buffer attached to the comitting transaction. So it drops
> all locks and calls sync_dirty_buffer(bh).
> Now in other process, file is truncated so that 'bh' gets just after EOF.
> As we have 1kb buffers, it can happen that bh is in the partially
> truncated page. Buffer is marked unmapped and clean. But in a moment the page
> is marked dirty and msync() is called. That eventually calls
> set_page_dirty() and all buffers in the page are marked dirty.
> The first process now wakes up, locks the buffer, clears the dirty bit
> and does submit_bh() - Oops.

Hm, just FWIW I have a couple traces* of the buffer getting unmapped
-before- journal_submit_data_buffers ever even finds it...

journal_submit_data_buffers():[fs/jbd/commit.c:242] needs writeout,
adding to array pid 1836
b_state:0x114025 b_jlist:BJ_SyncData cpu:0 b_count:2 b_blocknr:27130
b_jbd:1 b_frozen_data:0000000000000000
b_committed_data:0000000000000000
b_transaction:1 b_next_transaction:0 b_cp_transaction:0
b_trans_is_running:0
b_trans_is_comitting:1 b_jcount:0 pg_dirty:0

so it's already unmapped at this point. Could
journal_submit_data_buffers benefit from some buffer_mapped checks? Or
is that just a bandaid too late...

-Eric

*http://people.redhat.com/esandeen/traces/eric_ext3_oops1.txt
http://people.redhat.com/esandeen/traces/eric_ext3_oops2.txt
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/