Re: [RFC] ext3/jbd race: releasing in-use journal_heads

From: Stephen C. Tweedie
Date: Mon Mar 07 2005 - 20:12:13 EST


Hi,

On Mon, 2005-03-07 at 21:22, Stephen C. Tweedie wrote:

> altgr-scrlck is showing a range of EIPs all in ext3_direct_IO->
> invalidate_inode_pages2_range(). I'm seeing
>
> invalidate_inode_pages2_range()->pagevec_lookup()->find_get_pages()

In invalidate_inode_pages2_range(), what happens if we look up a pagevec,
get a bunch of pages back, but all the pages in the vec are beyond the
end of the range we want?

That's quite possible: pagevec_lookup() gets told how many pages we
want, but not the index of the end of the range.

The loop over the pages in the pagevec contains this:

	lock_page(page);
	if (page->mapping != mapping || page->index > end) {
		unlock_page(page);
		continue;
	}
	wait_on_page_writeback(page);
	next = page->index + 1;

Now, if all of the pages have page->index > end, we'll skip them all...
but we'll do it before we've advanced "next" to indicate that we've made
progress. The next iteration is just going to start in the same place.

Truncate always invalidates to EOF, which is probably why we haven't
seen this before. But O_DIRECT is doing limited range cache
invalidates, and is getting stuck here pretty repeatably.

I'm currently testing the small patch below; it seems to be running OK
so far, although it hasn't been going for long yet. I'll run a longer
test in the morning.

--Stephen

--- linux-2.6-ext3/mm/truncate.c.=K0002=.orig
+++ linux-2.6-ext3/mm/truncate.c
@@ -271,12 +271,13 @@ int invalidate_inode_pages2_range(struct
 			int was_dirty;

 			lock_page(page);
+			if (page->mapping == mapping)
+				next = page->index + 1;
 			if (page->mapping != mapping || page->index > end) {
 				unlock_page(page);
 				continue;
 			}
 			wait_on_page_writeback(page);
-			next = page->index + 1;
 			if (next == 0)
 				wrapped = 1;
 			while (page_mapped(page)) {