Re: [PATCH 00/11] IO-less dirty throttling v12

From: Wu Fengguang
Date: Mon Oct 10 2011 - 10:29:00 EST


Hi Trond,

> As for the NFS performance, the dd tests show that adding a writeback
> wait queue to limit the number of NFS PG_writeback pages (patches
> will follow) gains 48% throughput by itself:
>
>       3.1.0-rc8-ioless6+         3.1.0-rc8-nfs-wq+
> ------------------------  ------------------------
>                    22.43       +81.8%        40.77  NFS-thresh=100M/nfs-10dd-1M-32p-32768M-100M:10-X
>                    28.21       +52.6%        43.07  NFS-thresh=100M/nfs-1dd-1M-32p-32768M-100M:10-X
>                    29.21       +55.4%        45.39  NFS-thresh=100M/nfs-2dd-1M-32p-32768M-100M:10-X
>                    14.12       +40.4%        19.83  NFS-thresh=10M/nfs-10dd-1M-32p-32768M-10M:10-X
>                    29.44       +11.4%        32.81  NFS-thresh=10M/nfs-1dd-1M-32p-32768M-10M:10-X
>                     9.09      +240.9%        30.97  NFS-thresh=10M/nfs-2dd-1M-32p-32768M-10M:10-X
>                    25.68       +84.6%        47.42  NFS-thresh=1G/nfs-10dd-1M-32p-32768M-1024M:10-X
>                    41.06        +7.6%        44.20  NFS-thresh=1G/nfs-1dd-1M-32p-32768M-1024M:10-X
>                    39.13       +25.9%        49.26  NFS-thresh=1G/nfs-2dd-1M-32p-32768M-1024M:10-X
>                   238.38       +48.4%       353.72  TOTAL
>
> This results in a 28% overall improvement over the vanilla kernel:
>
>       3.1.0-rc4-vanilla+         3.1.0-rc8-nfs-wq+
> ------------------------  ------------------------
>                    20.89       +95.2%        40.77  NFS-thresh=100M/nfs-10dd-1M-32p-32768M-100M:10-X
>                    39.43        +9.2%        43.07  NFS-thresh=100M/nfs-1dd-1M-32p-32768M-100M:10-X
>                    26.60       +70.6%        45.39  NFS-thresh=100M/nfs-2dd-1M-32p-32768M-100M:10-X
>                    12.70       +56.1%        19.83  NFS-thresh=10M/nfs-10dd-1M-32p-32768M-10M:10-X
>                    27.41       +19.7%        32.81  NFS-thresh=10M/nfs-1dd-1M-32p-32768M-10M:10-X
>                    26.52       +16.8%        30.97  NFS-thresh=10M/nfs-2dd-1M-32p-32768M-10M:10-X
>                    40.70       +16.5%        47.42  NFS-thresh=1G/nfs-10dd-1M-32p-32768M-1024M:10-X
>                    45.28        -2.4%        44.20  NFS-thresh=1G/nfs-1dd-1M-32p-32768M-1024M:10-X
>                    35.74       +37.8%        49.26  NFS-thresh=1G/nfs-2dd-1M-32p-32768M-1024M:10-X
>                   275.28       +28.5%       353.72  TOTAL
>
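
For anyone rechecking the tables above: the middle column appears to be the plain relative change from the left kernel to the right one. A minimal sketch, assuming that definition (the helper name pct_change is made up for illustration and is not part of the test scripts):

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# TOTAL rows of the two tables (values copied from the tables).
ioless_vs_wq = pct_change(238.38, 353.72)   # about +48.4
vanilla_vs_wq = pct_change(275.28, 353.72)  # about +28.5
print(f"{ioless_vs_wq:+.1f}% {vanilla_vs_wq:+.1f}%")
```

Individual rows may differ in the last digit because the printed throughputs are already rounded.
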
> As for NFS commits, which are the main concern, the wait queue patch
> increases the (nr_commits / bytes_written) ratio by +74% for the
> thresh=1G,10dd case and +55% for the thresh=100M,10dd case; the change
> is mostly negligible in the other 1dd and 2dd cases, which looks acceptable.
>
> The other noticeable change of the wait queue is, the RTT time per

Sorry, that is not the RTT; it is mainly the local queue time of the WRITE RPCs.

> write is reduced by 1-2 orders of magnitude in many of the below cases
> (from dozens of seconds to hundreds of milliseconds).

I also measured the stddev of the network bandwidth and found generally
smoother network transfers with the wait queue, which is expected.

thresh=1G
           vanilla        ioless6         nfs-wq
1dd   83088173.728   53468627.578   53627922.011
2dd   52398918.208   43733074.167   53531381.177
10dd  67792638.857   44734947.283   39681731.234
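
As a rough illustration of the metric, here is a minimal sketch of computing the stddev over periodic bandwidth samples. The sampling method, the units and the function name bandwidth_stddev are assumptions for the example, not a description of how the numbers above were collected:

```python
import statistics


def bandwidth_stddev(samples):
    """Sample standard deviation of per-interval bandwidth readings.

    `samples` is a sequence of bandwidth values, one per sampling
    interval (hypothetical input; any consistent unit works).
    """
    return statistics.stdev(samples)


# Made-up readings: a bursty transfer versus a smooth one with the same mean.
bursty = [0, 100, 0, 100, 0, 100]
smooth = [45, 55, 50, 50, 48, 52]
print(bandwidth_stddev(bursty) > bandwidth_stddev(smooth))
```

A lower value means the transfer rate stays closer to its mean, i.e. the network traffic is smoother.
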

However, the major difference should still be that the writeback wait
queue significantly reduces the local queue time for the WRITE RPCs.

The wait queue patch looks reasonable in that it keeps the pages in
PG_dirty state rather than prematurely putting them into PG_writeback
state, only to queue them up for dozens of seconds before xmit.
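
To make the queue-time argument concrete, here is a toy model (not the NFS code, and not taken from the patches): pages drain over the link in order at a fixed rate, and an optional limit caps how many pages may sit in the writeback state at once. All names and numbers are hypothetical.

```python
def writeback_queue_times(total_pages, pages_per_sec, limit=None):
    """Return, per page, the seconds spent in the writeback state.

    Toy model: every page is dirty at t=0 and the link transmits pages
    in order at a constant rate. Without a limit, all pages enter the
    writeback state immediately. With a limit, page i enters only once
    page (i - limit) has completed, so at most `limit` pages are in
    the writeback state at any time.
    """
    done = []   # completion time of each page
    waits = []  # time each page spent in the writeback state
    for i in range(total_pages):
        if limit is None or i < limit:
            start = 0.0
        else:
            start = done[i - limit]  # wait for a writeback slot to free up
        finish = (i + 1) / pages_per_sec  # link is never idle in this model
        done.append(finish)
        waits.append(finish - start)
    return waits


unlimited = max(writeback_queue_times(1000, 100.0))
limited = max(writeback_queue_times(1000, 100.0, limit=50))
print(unlimited, limited)
```

In this model the total transfer time is identical in both cases; only the time a page spends queued in the writeback state changes, from total_pages / rate down to about limit / rate.
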

It should be safe because that is exactly the old, proven behavior from
before the per-bdi writeback patches were introduced in 2.6.32. The 2nd
patch, on proportional nfs_congestion_kb, is a new change, though.

Thanks,
Fengguang