Re: [RFC] ext3/jbd race: releasing in-use journal_heads

From: Jan Kara
Date: Tue Mar 08 2005 - 10:16:18 EST


> On Mon, 2005-03-07 at 21:08, Stephen C. Tweedie wrote:
> > Right, that was what I was thinking might be possible. But for now I've
> > just done the simple patch --- make sure we don't clear
> > jh->b_transaction when we're just refiling buffers from one list to
> > another. That should have the desired effect here without dangerous
> > messing around with locks.

> Fix destruction of in-use journal_head
> journal_put_journal_head() can destroy a journal_head at any time as
> long as the jh's b_jcount is zero and b_transaction is NULL. It has no
> locking protection against the rest of the journaling code, as the lock
> it uses to protect b_jcount and bh->b_private is not used elsewhere in
> jbd.
> However, there are small windows where b_transaction is getting set
> temporarily to NULL during normal operations; typically this is
> happening in
> __journal_unfile_buffer(jh);
> __journal_file_buffer(jh, ...);
> call pairs, as __journal_unfile_buffer() will set b_transaction to NULL
> and __journal_file_buffer() re-sets it afterwards. A truncate running
> in parallel can lead to journal_unmap_buffer() destroying the jh if it
> occurs between these two calls.
Isn't also the following scenario dangerous?


If journal_unmap_buffer() decides to destroy the buffer before we call
journal_remove_journal_head(), we get an assertion failure in
I believe the right fix really is that anyone having a pointer to
bh/jh should hold a reference counted in b_jcount - so basically the
idea Andrew proposed.
It could work like follows: When you first create jh you get a
reference in b_jcount (that is already happening). If you file a jh in
some transaction list, you will 'delegate' this reference to the
transaction and hence you won't do journal_put_journal_head() after
you're done. Now if you unfile a buffer from the list you 'get back' it's
reference (and you have to journal_put_journal_head() if you're not
going to file the buffer back again). The only thing you have to assure is
the proper list locking so that two different callers cannot 'get back' one
reference but that should be already done...
This solution should be doable without additional locking or big code
changes but maybe the reference policy is a bit too complicated...

> Fix this by adding a variant of __journal_unfile_buffer() which is only
> used for these temporary jh unlinks, and which leaves the b_transaction
> field intact so that we never leave a window open where b_transaction is
> Additionally, trap this error if it does occur, by checking against
> jh->b_jlist being non-null when we destroy a jh.

Jan Kara <jack@xxxxxxx>
SuSE CR Labs
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at