On Tue, Mar 15, 2016 at 05:51:17PM -0700, Chris Mason wrote:
> On Tue, Mar 15, 2016 at 07:30:14PM -0500, Eric Sandeen wrote:
> > On 3/15/16 7:06 PM, Linus Torvalds wrote:
> > > On Tue, Mar 15, 2016 at 4:52 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > > > It is pretty clear that the onus is on the patch submitter to
> > > > provide justification for inclusion, not for the reviewer/Maintainer
> > > > to have to prove that the solution is unworkable.
> > >
> > > I agree, but quite frankly, performance is a good justification.
> > >
> > > So if Ted can give performance numbers, that's justification enough.
> > > We've certainly taken changes with less.
> >
> > I've been away from ext4 for a while, so I'm really not on top of the
> > mechanics of the underlying problem at the moment.
> >
> > But I would say that in addition to numbers showing that ext4 has trouble
> > with unwritten extent conversion, we should have an explanation of
> > why it can't be solved in a way that doesn't open up these concerns.
> >
> > XFS certainly has different mechanisms, but is the demonstrated workload
> > problematic on XFS (or btrfs) as well? If not, can ext4 adopt any of the
> > solutions that make the workload perform better on other filesystems?
>
> When I've benchmarked this in the past, doing small random buffered writes
> into a preallocated extent was dramatically (3x or more) slower on xfs
> than doing them into a fully written extent. That was two years ago,
> but I can redo it.

So I re-ran some benchmarks, with 4K O_DIRECT random ios on nvme (4.5
kernel). This is O_DIRECT without O_SYNC. I don't think xfs will do
commits for each IO into the prealloc file? O_SYNC makes it much
slower, so hopefully I've got this right.

The test runs for 60 seconds, and I used an iodepth of 4:

prealloc file: 32,000 iops
overwrite: 121,000 iops

If I bump the iodepth up to 512:

prealloc file: 33,000 iops
overwrite: 279,000 iops
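
For anyone who wants to poke at the same comparison, here is a minimal sketch of the two cases in Python. It is not the tool used for the numbers above: the helper names `prepare` and `random_writes` are made up for illustration, it issues one synchronous write at a time (queue depth 1, not 4 or 512), and interpreter overhead means its absolute numbers will not match. It assumes Linux and a filesystem that accepts O_DIRECT.

```python
import mmap
import os
import random
import time

BLOCK = 4096  # 4K IO size, matching the random-write test described above


def prepare(path, size, prealloc):
    """Create a test file that is either preallocated or fully written."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o600)
    try:
        if prealloc:
            # Reserve the space without writing data. On filesystems with
            # fallocate support this leaves unwritten extents; elsewhere the
            # C library may fall back to writing zeros.
            os.posix_fallocate(fd, 0, size)
        else:
            # Write every block so that later IO is a plain overwrite.
            chunk = b"\0" * (1 << 20)
            written = 0
            while written < size:
                written += os.write(fd, chunk[: size - written])
            os.fsync(fd)
    finally:
        os.close(fd)


def random_writes(path, size, seconds, direct=True):
    """Issue random 4K writes for `seconds` seconds and return writes/sec."""
    flags = os.O_WRONLY | (os.O_DIRECT if direct else 0)
    fd = os.open(path, flags)
    # An anonymous mmap is page aligned, which O_DIRECT needs for its buffer.
    buf = mmap.mmap(-1, BLOCK)
    buf.write(b"x" * BLOCK)
    blocks = size // BLOCK
    done = 0
    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            # Block-aligned offset, no O_SYNC, one IO in flight at a time.
            os.pwrite(fd, buf, random.randrange(blocks) * BLOCK)
            done += 1
    finally:
        os.close(fd)
        buf.close()
    return done / seconds
```

Calling `prepare(path, size, True)` followed by `random_writes(path, size, 60)`, and then the same pair with `prealloc=False`, gives the prealloc and overwrite cases respectively. Each run into a preallocated file converts extents as it goes, so the file needs to be recreated before every prealloc run.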

For streaming writes, XFS converts prealloc to written much better when
the IO isn't random. You can start seeing the difference at 16K
sequential O_DIRECT writes, but it's really not a huge impact. The worst
case is 4K:

prealloc file: 227 MB/s
overwrite: 340 MB/s
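
To put the three results on one scale, the arithmetic below divides each overwrite figure by the matching prealloc figure. The numbers are copied from the results above; the `slowdown` helper and the dictionary labels are only there for illustration.

```python
# Figures reported above, as (prealloc, overwrite) pairs.
results = {
    "random 4K, iodepth 4 (iops)": (32_000, 121_000),
    "random 4K, iodepth 512 (iops)": (33_000, 279_000),
    "sequential 4K (MB/s)": (227, 340),
}


def slowdown(prealloc, overwrite):
    """How many times faster the overwrite case is than the prealloc case."""
    return overwrite / prealloc


for name, (pre, over) in results.items():
    print(f"{name}: overwrite is {slowdown(pre, over):.1f}x prealloc")
```

That works out to roughly 3.8x at iodepth 4, 8.5x at iodepth 512, and 1.5x for the sequential 4K case. Prealloc throughput barely moves with queue depth while overwrite more than doubles, which is why the gap widens.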

I can't think of sequential workloads where this will matter, since they
will either end up with bigger IO or the performance impact won't get
noticed.

-chris