Re: XFS vs Elevators (was Re: [PATCH RFC] nilfs2: continuous snapshotting file system)

From: Martin Steigerwald
Date: Thu Aug 21 2008 - 07:36:52 EST


On Thursday 21 August 2008, Dave Chinner wrote:
> On Thu, Aug 21, 2008 at 04:04:18PM +1000, Dave Chinner wrote:
> > On Thu, Aug 21, 2008 at 03:15:08PM +1000, Dave Chinner wrote:
> > > On Thu, Aug 21, 2008 at 05:46:00AM +0300, Szabolcs Szakacsits wrote:
> > > > On Thu, 21 Aug 2008, Dave Chinner wrote:
> > > > Everything is default.
> > > >
> > > > % rpm -qf =mkfs.xfs
> > > > xfsprogs-2.9.8-7.1
> > > >
> > > > which, according to ftp://oss.sgi.com/projects/xfs/cmd_tars, is
> > > > the latest stable mkfs.xfs. Its output is
> > > >
> > > > meta-data=/dev/sda8              isize=256    agcount=4, agsize=1221440 blks
> > > >          =                       sectsz=512   attr=2
> > > > data     =                       bsize=4096   blocks=4885760, imaxpct=25
> > > >          =                       sunit=0      swidth=0 blks
> > > > naming   =version 2              bsize=4096
> > > > log      =internal log           bsize=4096   blocks=2560, version=2
> > > >          =                       sectsz=512   sunit=0 blks, lazy-count=0
> > > > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > >
> > > Ok, I thought it might be the tiny log, but it didn't improve
> > > anything here when I increased the log size, or the log buffer size.
> >
> > One thing I just found out - my old *laptop* is 4-5x faster than the
> > 10krpm scsi disk behind an old cciss raid controller. I'm wondering
> > if the long delays in dispatch are caused by an interaction with CTQ
> > but I can't change it on the cciss raid controllers. Are you using
> > ctq/ncq on your machine? If so, can you reduce the depth to
> > something less than 4 and see what difference that makes?
>
> Just to point out - this is not a new problem - I can reproduce
> it on 2.6.24 as well as 2.6.26. Likewise, my laptop shows XFS
> being faster than ext3 on both 2.6.24 and 2.6.26. So the difference
> is something related to the disk subsystem on the server....

Interesting. I switched from cfq to deadline some time ago because of abysmal
XFS performance under parallel IO, for example running aptitude upgrade while
doing desktop stuff. This is just my subjective perception, but I have seen it
crawl, and even stall for 5-10 seconds at times. I found deadline to be way
faster initially, but on rare occasions IO for desktop tasks basically stalled
for even longer, say 15 seconds or more, under parallel IO. However, I can't
remember having this problem with the most recent kernel, 2.6.26.2.
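
For anyone who wants to repeat the scheduler switch, or try the queue depth
change Dave asked about, here is a minimal sketch of how both can be done
through sysfs. The function names are my own invention, not an existing tool.
It assumes the usual /sys/block/<dev>/queue/scheduler file, and
/sys/block/<dev>/device/queue_depth on SCSI/libata devices that expose it.
Writing either file needs root.

```python
from pathlib import Path


def parse_schedulers(text):
    """Split sysfs text such as 'noop deadline [cfq]' into (active, available)."""
    active = None
    available = []
    for name in text.split():
        # The kernel marks the currently active elevator with brackets.
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def scheduler_path(dev):
    """sysfs file that lists the elevators for a block device, e.g. 'sda'."""
    return Path("/sys/block") / dev / "queue" / "scheduler"


def queue_depth_path(dev):
    """sysfs file holding the TCQ/NCQ depth (only on devices that expose it)."""
    return Path("/sys/block") / dev / "device" / "queue_depth"


def set_scheduler(dev, name):
    """Switch the elevator for dev; raises ValueError if name is not offered."""
    _, available = parse_schedulers(scheduler_path(dev).read_text())
    if name not in available:
        raise ValueError("%s not offered for %s: %s" % (name, dev, available))
    scheduler_path(dev).write_text(name)


def set_queue_depth(dev, depth):
    """Set the queue depth, e.g. to something below 4 as suggested above."""
    queue_depth_path(dev).write_text(str(int(depth)))
```

With this, set_scheduler("sda", "deadline") followed by set_queue_depth("sda", 2)
would be one way to set up the comparison, assuming the disk is sda.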

I am now testing with cfq again, on a ThinkPad T42 with an internal 160 GB
hard disk and barriers enabled. But you say it only happens on certain
servers, so I might have seen something different.

Thus I had the rough feeling that something is wrong with at least CFQ and
XFS together, but I couldn't prove it back then. I have no idea how to
easily build a reproducible test case. Maybe a script that unpacks
kernel source archives while I try to use the desktop...
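
A rough sketch of what such a test case could look like. This is only an
assumption about how to approximate the situation, not a validated benchmark:
a background thread keeps writing files (a stand-in for unpacking a kernel
tarball), while the foreground times small fsync'ed writes (a stand-in for
desktop IO) and reports the longest stall. All names are made up for this
example. The directory has to be on the filesystem under test.

```python
import os
import sys
import tempfile
import threading
import time


def background_writer(directory, stop, file_size=1 << 20):
    """Rewrite a small set of files until stop is set (stand-in for untarring)."""
    block = b"x" * file_size
    i = 0
    while not stop.is_set():
        with open(os.path.join(directory, "bg-%d" % (i % 16)), "wb") as f:
            f.write(block)
        i += 1


def probe_latencies(directory, samples=20, interval=0.05):
    """Time small synchronous writes; returns one latency in seconds per sample."""
    latencies = []
    path = os.path.join(directory, "probe")
    for _ in range(samples):
        start = time.monotonic()
        with open(path, "wb") as f:
            f.write(b"probe")
            f.flush()
            os.fsync(f.fileno())  # force the write out to disk
        latencies.append(time.monotonic() - start)
        time.sleep(interval)
    return latencies


def run(directory, samples=20):
    """Measure probe latencies while the background writer is running."""
    stop = threading.Event()
    worker = threading.Thread(target=background_writer, args=(directory, stop))
    worker.start()
    try:
        return probe_latencies(directory, samples)
    finally:
        stop.set()
        worker.join()


if __name__ == "__main__":
    # Pass a directory on the XFS mount; the fallback is a temp dir for a dry run.
    if len(sys.argv) > 1:
        result = run(sys.argv[1])
    else:
        with tempfile.TemporaryDirectory() as scratch:
            result = run(scratch)
    print("max stall: %.3f s over %d probes" % (max(result), len(result)))
```

Running it once per elevator on the same partition should at least give
numbers to compare, instead of my subjective impression.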

--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7