Re: 2.6.23-rc6: hanging ext3 dbench tests

From: Jan Kara
Date: Tue Sep 11 2007 - 12:55:48 EST


> I have a couple of failed test runs against 2.6.23-rc6 where the
> job timed out while running dbench over ext3. Both on powerpc,
> though both significantly different hardware setups. A failed
> run like this implies that the machine was still responsive to
> other processes but the dbench was making no progress. There is
> no console diagnostics during the failure.
>
> beavis was lost during a plain ext3 dbench run, having just
> successfully run a complete ext2 run. elm3b19 was lost during an
> ext3 "data=writeback" dbench run, having already completed an plain
> ext2, and ext3 runs.
OK, thanks for report.

> A quick poke at the dbench logs on the second machine shows this
> for the working ext3 dbench run:
>
> 4 clients started
> 4 35288 814.49 MB/sec
> 0 62477 822.99 MB/sec
> Throughput 822.954 MB/sec 4 procs
>
> Whereas the hanging run shows the following continuing until the
> machine is reset, which confirms that the machine as a whole was
> still with us:
>
> 4 clients started
> 4 36479 824.92 MB/sec
> 1 46857 519.98 MB/sec
> 1 46857 346.65 MB/sec
> 1 46857 259.99 MB/sec
> 1 46857 207.99 MB/sec
> 1 46857 173.32 MB/sec
> 1 46857 148.56 MB/sec
> 1 46857 129.99 MB/sec
> 1 46857 115.55 MB/sec
> 1 46857 103.99 MB/sec
> 1 46857 94.54 MB/sec
> 1 46857 86.66 MB/sec
> 1 46857 80.00 MB/sec
> [...]
So the process doing IO is hung. Could you dump stack of all the tasks
(Alt-Sysrq-t) and send the output? We've probably deadlocked
somewhere...

> The first machine is very similar:
>
> 4 clients started
> 4 18468 445.29 MB/sec
> 4 41945 469.36 MB/sec
> 1 46857 346.68 MB/sec
> 1 46857 260.00 MB/sec
> 1 46857 208.00 MB/sec
> [...]
>
> Not sure if there is any significance to the 46857. Though it feels
> like we may be at the end of the run when it fails.
>
> I will try and reproduce this on one of the machines and see if I
> can get any further info.

Honza
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/