Re: howto combat highly pathologic latencies on a server?

From: Hans-Peter Jansen
Date: Thu Mar 11 2010 - 11:59:10 EST


On Thursday 11 March 2010, 00:29:40 Dave Chinner wrote:
> On Wed, Mar 10, 2010 at 06:17:42PM +0100, Hans-Peter Jansen wrote:
> >
> > The xfs filesystems are mounted with rw,noatime,attr2,nobarrier,noquota
> > (yes, I do have a BBU on the areca, and disk write cache is effectively
> > turned off).
>
> Make sure the filesystem has the "lazy-count=1" attribute set (use
> xfs_info to check, xfs_admin to change). That will remove the
> superblock from most transactions and significantly reduce latency of
> transactions as they serialise while locking it...

I have done that now on my local test system, but on one of its filesystems
xfs_admin -c1 didn't succeed. It simply stopped, waiting on a futex.

Famous last syscall:
6750 futex(0x868330c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>

Consequently, xfs_repair behaved similarly, hanging in phase 6, traversing
filesystem... I have a huge strace from this run, if anyone is interested.

It's a 3 TB RAID 5 array (4 * 1 TB disks) with one FS, also driven by the Areca:

meta-data=/dev/sdb1              isize=256    agcount=4, agsize=183105406 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=732421623, imaxpct=5
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
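
In case someone wants to script the lazy-count check across several
filesystems, here is a rough Python sketch that parses xfs_info-style text
like the above. The sample string is just the output from this mail pasted
in, and the helper name parse_xfs_info is made up for illustration; it is
not part of xfsprogs. It only picks up numeric key=value pairs, which is
enough to read lazy-count and to cross-check the data size.

```python
import re

# Sample text copied from the xfs_info output in this mail.
# Column alignment does not matter to the parser.
XFS_INFO = """\
meta-data=/dev/sdb1 isize=256 agcount=4, agsize=183105406 blks
 = sectsz=512 attr=2
data = bsize=4096 blocks=732421623, imaxpct=5
 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal bsize=4096 blocks=32768, version=2
 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
"""

# A section line starts with its name in column 0; continuation
# lines start with whitespace followed by "=".
SECTION_RE = re.compile(r"^([a-z-]+)\s*=")
# Numeric fields only, e.g. "bsize=4096" or "lazy-count=1".
FIELD_RE = re.compile(r"([a-z0-9-]+)=(\d+)")


def parse_xfs_info(text):
    """Hypothetical helper: return {section: {key: int}}."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
        if current is None:
            continue
        fields = sections.setdefault(current, {})
        for key, value in FIELD_RE.findall(line):
            # Keep the first value seen for a key within a section.
            fields.setdefault(key, int(value))
    return sections


info = parse_xfs_info(XFS_INFO)
lazy = info["log"].get("lazy-count")
size_bytes = info["data"]["blocks"] * info["data"]["bsize"]

print("lazy-count:", lazy)
print("data size: %.2f TB" % (size_bytes / 1e12))
```

On a live system the text would presumably come from running xfs_info on
the mount point instead of a pasted string; that part is left out here.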

Luckily, xfs_repair -P finally did succeed. Phuah..

This is with: xfs_repair version 2.10.1.

After calling xfs_admin -c1, all filesystems showed differences in
superblock features (from an xfs_repair -n run). Is running xfs_repair
mandatory, or does the first mount fix this automatically?

Thanks,
Pete