Re: O_DIRECT to md raid 6 is slow

From: Miquel van Smoorenburg
Date: Fri Aug 17 2012 - 07:16:37 EST


On 08/17/2012 09:31 AM, Stan Hoeppner wrote:
On 8/16/2012 4:50 PM, Miquel van Smoorenburg wrote:
I did a simple test:

* created a 1G partition on 3 separate disks
* created a md raid5 array with 512K chunksize:
mdadm -C /dev/md0 -l 5 -c $((1024*512)) -n 3 /dev/sdb1 /dev/sdc1 /dev/sdd1
* ran disk monitoring using 'iostat -k 5 /dev/sdb1 /dev/sdc1 /dev/sdd1'
* wrote a single 4K block:
dd if=/dev/zero bs=4K count=1 oflag=direct seek=30 of=/dev/md0

Output from iostat over the period in which the 4K write was done. Look
at kB read and kB written:

Device:    tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdb1      0.60         0.00         1.60          0          8
sdc1      0.60         0.80         0.80          4          4
sdd1      0.60         0.00         1.60          0          8

As you can see, a single 4K read, and a few writes. You see a few more
blocks written than you'd expect because the superblock is updated too.
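
As a rough sanity check on those numbers (a sketch only, not part of the
original test): assuming the 4K write updates one 4K data block and one 4K
parity block, whatever remains in the kB_wrtn column is what would be
attributed to the superblock update. The "expected" figure is that
assumption; the per-disk values are copied from the iostat output above.

```shell
# kB_wrtn per member device, copied from the iostat output above
sdb1=8
sdc1=4
sdd1=8

# Total kB written across all three members
total=$((sdb1 + sdc1 + sdd1))

# Assumption: one 4K data block plus one 4K parity block for the dd write
expected=$((4 + 4))

# Whatever is left over is attributed to the superblock update
extra=$((total - expected))

echo "total=${total}kB expected=${expected}kB extra=${extra}kB"
```

With the numbers above this prints total=20kB expected=8kB extra=12kB.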

I'm no dd expert, but this looks like you're simply writing a 4KB block
to a new stripe, using an offset, but not to an existing stripe, as the
array is in a virgin state. So it doesn't appear this test is going to
trigger RMW. Don't you now need to do another write in the same
stripe to trigger RMW? Maybe I'm just reading this wrong.

That shouldn't matter, but that is easily checked of course, by writing some random data first, then doing the dd 4K write, also with random data, somewhere in the same area:

# dd if=/dev/urandom bs=1M count=3 of=/dev/md0
3+0 records in
3+0 records out
3145728 bytes (3.1 MB) copied, 0.794494 s, 4.0 MB/s

Now the first 6 chunks are filled with random data. Let's write 4K somewhere in there:

# dd if=/dev/urandom bs=4k count=1 seek=25 of=/dev/md0
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.10149 s, 40.4 kB/s
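
To show where that write lands, here is a small arithmetic sketch. It
assumes the 512K chunk size stated above and that a 3-disk raid5 holds 2
data chunks per stripe; the variable names are only for illustration.

```shell
# Values taken from the dd commands above
bs=4096                      # dd block size (bs=4k)
seek=25                      # dd seek=25
filled=$((3 * 1024 * 1024))  # bytes written by the earlier 3 x 1M dd

# Assumed array geometry: 512K chunks, 3-disk raid5 (2 data + 1 parity)
chunk=$((512 * 1024))
data_disks=2

offset=$((seek * bs))                    # byte offset of the 4K write
chunk_idx=$((offset / chunk))            # data chunk containing that offset
stripe_idx=$((chunk_idx / data_disks))   # stripe containing that chunk
filled_chunks=$((filled / chunk))        # data chunks covered by the 3M write

echo "offset=${offset} chunk=${chunk_idx} stripe=${stripe_idx} filled_chunks=${filled_chunks}"
```

This prints offset=102400 chunk=0 stripe=0 filled_chunks=6, so under those
assumptions the 4K write falls in the first chunk of the first stripe, well
inside the area that was filled with random data.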

Output from iostat over the period in which the 4K write was done:

Device:    tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sdb1      0.60         0.00         1.60          0          8
sdc1      0.60         0.80         0.80          4          4
sdd1      0.60         0.00         1.60          0          8

Mike.