Re: Linux 2.6.29

From: Jeff Garzik
Date: Fri Apr 03 2009 - 04:32:16 EST


Linus Torvalds wrote:
> On Thu, 2 Apr 2009, Jeff Garzik wrote:
> > The most interesting thing I found: the SSD does 80 MB/s for the first ~1 GB
> > or so, then slows down dramatically. After ~2GB, it is down to 32 MB/s.
> > After ~4GB, it reaches a steady speed around 23 MB/s.
>
> Are you sure that isn't an effect of double and triple indirect blocks etc? The metadata updates get more complex for the deeper indirect blocks.
>
> Or just our page cache lookup? Maybe our radix tree thing hits something stupid. Although it sure shouldn't be _that_ noticeable.
>
> > There is a similar performance fall-off for the Seagate, but much less
> > pronounced:
> > After 1GB: 52 MB/s
> > After 2GB: 44 MB/s
> > After 3GB: steady state
>
> That would seem to indicate that it's something else than the disk speed.

Attached are some additional tests using sync_file_range(), dd, an SSD and a normal SATA disk. The test program -- overwrite.c -- is unchanged from my last posting, basically the same as Linus's except with posix_fadvise().
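
For readers without the attachment, here is a minimal sketch of the kind of loop such a test might use. It is not the actual overwrite.c: the function name overwrite_file(), its parameters, and the fill byte are all hypothetical, and the exact ordering of the sync_file_range() and posix_fadvise() calls is an assumption. It writes fixed-size chunks, starts writeback on each chunk right away, then waits on the previous chunk and optionally drops it from the page cache.

```c
#define _GNU_SOURCE /* needed for sync_file_range() */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (NOT the attached overwrite.c): write `count` buffers
 * of `bufsize` bytes to `path`. Writeback of each chunk is started as soon
 * as it is written; the previous chunk is then waited on and, if
 * `use_fadvise` is set, dropped from the page cache.
 * Returns the number of bytes written, or -1 on error.
 */
long long overwrite_file(const char *path, size_t bufsize, int count,
                         int use_fadvise)
{
    long long total = -1;
    long long written = 0;
    char *buf;
    int fd;

    assert(path != NULL && bufsize > 0 && count > 0);

    buf = malloc(bufsize);
    if (buf == NULL)
        return -1;
    memset(buf, 0x5a, bufsize); /* arbitrary fill pattern */

    /* No O_TRUNC: an existing file is rewritten in place. */
    fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        goto out_free;

    for (int i = 0; i < count; i++) {
        off_t off = (off_t)i * (off_t)bufsize;

        /* Write one full buffer, retrying on short writes. */
        for (size_t done = 0; done < bufsize; ) {
            ssize_t n = pwrite(fd, buf + done, bufsize - done,
                               off + (off_t)done);
            if (n < 0)
                goto out_close;
            done += (size_t)n;
        }
        written += (long long)bufsize;

        /* Start asynchronous writeback of the chunk just written. */
        if (sync_file_range(fd, off, (off_t)bufsize,
                            SYNC_FILE_RANGE_WRITE) < 0)
            goto out_close;

        if (i > 0) {
            off_t prev = off - (off_t)bufsize;

            /* Wait until writeback of the previous chunk has completed. */
            if (sync_file_range(fd, prev, (off_t)bufsize,
                                SYNC_FILE_RANGE_WAIT_BEFORE |
                                SYNC_FILE_RANGE_WRITE |
                                SYNC_FILE_RANGE_WAIT_AFTER) < 0)
                goto out_close;

            /* Optionally hint that its cached pages are no longer needed. */
            if (use_fadvise &&
                posix_fadvise(fd, prev, (off_t)bufsize,
                              POSIX_FADV_DONTNEED) != 0)
                goto out_close;
        }
    }
    total = written;

out_close:
    close(fd);
out_free:
    free(buf);
    return total;
}
```

Under these assumptions, a run comparable to the logs below would be something like overwrite_file("testfile", 8 << 20, 2800, 1), where "testfile" is a placeholder path. Note that sync_file_range() only pushes page-cache data to the device; it does not flush metadata or the drive's write cache.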

Observations:

* the no-name SSD does seem to burst the first ~1GB of writes rapidly, but degrades to a much lower sustained level, as observed before. Repeated tests do not produce ~80 MB/s, only the first test, which lends credence to the theory about background activity.

* For the SSD, overwrite is noticeably faster than dd.

* For the Seagate NCQ hard drive, dd is noticeably faster than overwrite.

* fadvise() appears to help slightly, but the results are mostly inconclusive or lost in the noise: a slight increase in throughput, and a slight increase in system time.
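
When comparing the dd and overwrite figures, it may be worth checking that both are in the same units. GNU dd reports decimal megabytes (10^6 bytes); whether overwrite reports decimal or binary units is not stated in this posting, so treat the binary case as an assumption. A small sketch of the conversion, with hypothetical helper names:

```c
#include <assert.h>

/* Throughput in decimal megabytes (10^6 bytes) per second, as GNU dd reports. */
double mb_per_s(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / 1e6;
}

/* Throughput in binary mebibytes (2^20 bytes) per second. */
double mib_per_s(double bytes, double seconds)
{
    assert(seconds > 0.0);
    return bytes / seconds / (1024.0 * 1024.0);
}
```

For the first SSD dd run in the logs, mb_per_s(25165824000, 917.599) gives about 27.4, matching dd's own figure, while mib_per_s() for the same run gives about 26.2. If overwrite's numbers are binary, they would read about 5% lower than dd's for the same real throughput, which is small next to the SSD gap but comparable to the Seagate gap.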

The test sequence for both SATA devices was the following:

3 x dd
3 x overwrite
3 x overwrite w/ posix_fadvise(POSIX_FADV_DONTNEED)

System setup: Intel Nehalem x86-64, ICH10, Fedora 10, ext3 filesystem (mounted defaults + noatime), 2.6.29 vanilla kernel.

Regards,

Jeff





=======================================================
128GB, 3.0 Gbps no-name SATA SSD, x86-64, ext3, 2.6.29 vanilla

First dd(1) creates the file, others simply rewrite it.
=======================================================
24000+0 records in
24000+0 records out
25165824000 bytes (25 GB) copied, 917.599 s, 27.4 MB/s

real 15m30.928s
user 0m0.016s
sys 1m3.924s


24000+0 records in
24000+0 records out
25165824000 bytes (25 GB) copied, 1056.92 s, 23.8 MB/s

real 18m1.686s
user 0m0.016s
sys 1m4.816s


24000+0 records in
24000+0 records out
25165824000 bytes (25 GB) copied, 1044.25 s, 24.1 MB/s

real 17m37.884s
user 0m0.020s
sys 1m4.300s


writing 2800 buffers of size 8m
21.867 GB written in 645.56 (34 MB/s)

real 10m46.502s
user 0m0.044s
sys 0m35.990s


writing 2800 buffers of size 8m
21.867 GB written in 634.55 (35 MB/s)

real 10m35.448s
user 0m0.036s
sys 0m36.466s


writing 2800 buffers of size 8m
21.867 GB written in 642.00 (34 MB/s)

real 10m42.890s
user 0m0.044s
sys 0m34.930s


using fadvise()
writing 2800 buffers of size 8m
21.867 GB written in 639.49 (35 MB/s)

real 10m40.384s
user 0m0.036s
sys 0m38.582s


using fadvise()
writing 2800 buffers of size 8m
21.867 GB written in 636.17 (35 MB/s)

real 10m37.061s
user 0m0.024s
sys 0m39.146s


using fadvise()
writing 2800 buffers of size 8m
21.867 GB written in 636.07 (35 MB/s)

real 10m37.003s
user 0m0.060s
sys 0m39.174s


=======================================================
500GB, 3.0Gbps Seagate SATA drive, x86-64, ext3, 2.6.29 vanilla

First dd(1) creates the file, others simply rewrite it.
=======================================================
24000+0 records in
24000+0 records out
25165824000 bytes (25 GB) copied, 494.797 s, 50.9 MB/s

real 8m42.680s
user 0m0.016s
sys 0m58.176s


24000+0 records in
24000+0 records out
25165824000 bytes (25 GB) copied, 498.295 s, 50.5 MB/s

real 8m27.505s
user 0m0.016s
sys 0m58.744s


24000+0 records in
24000+0 records out
25165824000 bytes (25 GB) copied, 492.145 s, 51.1 MB/s

real 8m23.616s
user 0m0.016s
sys 0m59.064s


writing 2800 buffers of size 8m
21.867 GB written in 478.41 (46 MB/s)

real 7m59.690s
user 0m0.032s
sys 0m33.210s


writing 2800 buffers of size 8m
21.867 GB written in 513.54 (43 MB/s)

real 8m34.461s
user 0m0.048s
sys 0m33.342s


writing 2800 buffers of size 8m
21.867 GB written in 471.38 (47 MB/s)

real 7m52.641s
user 0m0.020s
sys 0m33.486s


using fadvise()
writing 2800 buffers of size 8m
21.867 GB written in 467.67 (47 MB/s)

real 7m48.756s
user 0m0.048s
sys 0m36.838s


using fadvise()
writing 2800 buffers of size 8m
21.867 GB written in 462.69 (48 MB/s)

real 7m43.597s
user 0m0.020s
sys 0m37.462s


using fadvise()
writing 2800 buffers of size 8m
21.867 GB written in 463.56 (48 MB/s)

real 7m44.472s
user 0m0.036s
sys 0m37.342s


Attachment: run-test.sh
Description: Bourne shell script