Re: 2.6.12-rc2 + rc3: reaim with ext3 - system stalls.

From: Jan Kara
Date: Tue May 03 2005 - 09:48:02 EST


Hello,

> Started seeing some odd behaviour with recent kernels, haven't been able to
> run it down, could use some suggestions/help.
>
> Running re-aim7 with 2.6.12-rc2 and rc3, if I use xfs, jfs, or
> reiserfs things work just fine.
>
> With ext3, the test stalls, such that:
> CPU is 50% idle, 50% waiting IO (top)
> vmstat shows one process blocked wio

I've looked through your dumps and spotted where the problem is: it's
our well known and beloved lock inversion between PageLock and
transaction start (CCing Badari, who is the author of the patch that
introduced it, AFAIK).
The correct order is: first get PageLock and *then* start the transaction.
But ext3_writeback_writepages() calls ext3_journal_start() first and
then calls __mpage_writepages(), which tries to do LockPage, and there is
the deadlock. Badari, could you please fix that (sadly, I think it
will not be easy)? Maybe we should back out those changes until it gets
fixed...
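
To illustrate the ordering rule above, here is a minimal userspace sketch,
not kernel code. It assumes two plain pthread mutexes as stand-ins for the
page lock and for a running transaction; the real journal is not a simple
mutex, so this is only an analogy. All names in it (page_lock, journal_lock,
write_one_page, run_writers) are hypothetical and are not ext3 or JBD
functions. Every path takes the page lock first and the "transaction"
second, so no thread can hold one lock while waiting for the other in the
opposite order.

```c
/* Userspace analogy only: none of these names are kernel APIs. */
#include <pthread.h>
#include <stddef.h>

#define MAX_WRITERS 16

/* Hypothetical stand-ins for the page lock and a running transaction. */
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t journal_lock = PTHREAD_MUTEX_INITIALIZER;
static long pages_written;

/*
 * The one allowed order: page lock first, then the transaction.
 * A second path that took journal_lock first and page_lock second
 * could deadlock against this one (classic ABBA inversion).
 */
static void write_one_page(void)
{
    pthread_mutex_lock(&page_lock);
    pthread_mutex_lock(&journal_lock);
    pages_written++;
    pthread_mutex_unlock(&journal_lock);
    pthread_mutex_unlock(&page_lock);
}

struct writer_arg {
    int iters;
};

static void *writer(void *p)
{
    const struct writer_arg *arg = p;

    for (int i = 0; i < arg->iters; i++)
        write_one_page();
    return NULL;
}

/*
 * Run nthreads writers, each writing iters pages.
 * Returns the total number of pages written, or -1 if nthreads is out
 * of range or a thread could not be created.
 */
long run_writers(int nthreads, int iters)
{
    pthread_t tids[MAX_WRITERS];
    struct writer_arg arg = { iters };
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_WRITERS)
        return -1;

    pages_written = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&tids[started], NULL, writer, &arg) != 0)
            break;
    }
    /* Join whatever was started; after this, reading the counter is safe. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return started == nthreads ? pages_written : -1;
}
```

Built with -pthread, a call such as run_writers(4, 1000) runs four
writers to completion because they all agree on the lock order. The
inversion described above corresponds to one path swapping the two lock
calls in write_one_page().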

Honza
