Re: [PATCH] ext4: reduce scheduling latency with delayed allocation

From: tytso
Date: Mon Mar 01 2010 - 22:06:29 EST

On Mon, Mar 01, 2010 at 01:34:35PM +0100, Michal Schmidt wrote:
> mpage_da_submit_io() may process tens of thousands of pages at a time.
> Unless full preemption is enabled, it causes scheduling latencies on the order
> of tens of milliseconds.
> It can be reproduced simply by writing a big file on ext4 repeatedly with
> dd if=/dev/zero of=/tmp/dummy bs=10M count=50
> The patch fixes it by allowing the loop to reschedule.
> cyclictest can be used to measure the latency. I tested with:
> $ cyclictest -t1 -p 80 -n -i 5000 -m -l 20000
> The results from a UP AMD Turion 2GHz with voluntary preemption:
> Without the patch:
> T: 0 ( 2535) P:80 I:5000 C: 20000 Min: 12 Act: 23 Avg: 3166 Max: 70524
> (i.e. Average latency was more than 3 ms. Max observed latency was 71 ms.)
> With the patch:
> T: 0 ( 2588) P:80 I:5000 C: 20000 Min: 13 Act: 33 Avg: 49 Max: 11009
> (i.e. Average latency was only 49 us. Max observed latency was 11 ms.)

Have you tested for any performance regressions as a result of this
patch, using some file system benchmarks?

I don't think this is the best way to fix this problem, though. The
real right answer is to change how the code is structured. All of the
callsites that call mpage_da_submit_io() are immediately preceded by
mpage_da_map_blocks(). These two functions should be combined and
instead of calling ext4_writepage() for each page,
mpage_da_map_and_write_blocks() should make a single call to
submit_bio() for each extent. That should be far more CPU efficient,
solving both your scheduling latency issue as well as helping out for
benchmarks that strive to stress both the disk and CPU simultaneously
(such as for example the TPC benchmarks).

This will also make our blktrace results much more compact, and Chris
Mason will be very happy about that!

- Ted
