Re: Ext3 vs NTFS performance

From: Valerie Henson
Date: Thu May 03 2007 - 17:15:33 EST


On Thu, May 03, 2007 at 01:44:14AM +1000, David Chinner wrote:
> On Tue, May 01, 2007 at 01:43:18PM -0700, Cabot, Mason B wrote:
> > Hello all,
> >
> > I've been testing the NAS performance of ext3/Openfiler 2.2 against
> > NTFS/WinXP and have found that NTFS significantly outperforms ext3 for
> > video workloads. The Windows CIFS client will attempt a poor-man's
> > pre-allocation of the file on the server by sending 1-byte writes at
> > 128K-byte strides, breaking block allocation on ext3 and leading to
> > fragmentation and poor performance. This will happen for many
> > applications (including iTunes) as the CIFS client issues these
> > pre-allocates under the application layer.
> >
> > I've posted a brief paper on Intel's OSS website
> > (http://softwarecommunity.intel.com/articles/eng/1259.htm). Please give
> > it a read and let me know what you think. In particular, I'd like to
> > arrive at the right place to fix this problem: is it in the filesystem,
> > VFS, or Samba?
>
> As I commented on IRC to Val Henson - the XFS performance indicates
> that it is not a VFS or Samba problem.

In terms of what piece of code we can swap out and get good
performance, the problem is indeed in ext3 - it's clear that the cause
of the bad performance is the 1-byte writes resulting in ext3
fragmenting the on-disk layout of the file, and replacing it with XFS
results in nice, clean, unfragmented files.

But in terms of what we should do to fix it, there is the possibility
of some debate. In general, I think there is a lot of code stuck down
in individual file systems - especially in XFS - that could be
usefully hoisted up to a higher level as generic helper functions.
For example, we've got at least two implementations of reservations,
one in XFS and one in ext3/4. At least some of the code could be
generic - both file systems want to reserve long contiguous extents -
with the actual mechanics of looking up and reserving free blocks
implemented in per-fs code.

I'd really like to see a generic VFS-level detection of
read()/write()/creat()/mkdir()/etc. patterns which could detect things
like "Oh, this file is likely to be deleted immediately, wait and see
if it goes away and don't bother sending it on to the FS immediately"
or "Looks like this file will grow pretty big, let's go pre-allocate
some space for it." This is probably best done as a set of helper
functions in the usual way.

For this particular case, Ted is probably right and the only place
we'll ever see this insane poor man's pre-allocate pattern is from the
Windows CIFS client, in which case fixing this in Samba makes sense -
although I'm a bit horrified by the idea of writing 128K of zeroes to
pre-allocate... oh well, it's temporary, and what we care about here
is the read performance, more than the write performance.

-VAL
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/