Re: tuning for large files in xfs

From: Nathan Scott
Date: Mon May 22 2006 - 21:59:36 EST


Hi Tim,

On Mon, May 22, 2006 at 05:49:43PM -0700, fitzboy wrote:
> Nathan Scott wrote:
> > Can you send xfs_info output for the filesystem and the output
> > from xfs_bmap -vvp on this file?
> xfs_info:
> meta-data=/mnt/array/disk1 isize=2048 agcount=410, agsize=524288

Thats odd - why such a large number of allocation groups?

> blks
> = sectsz=512
> data = bsize=4096 blocks=214670562, imaxpct=25
> = sunit=16 swidth=192 blks, unwritten=1
> naming =version 2 bsize=4096
> log =internal bsize=4096 blocks=8192, version=1
> = sectsz=512 sunit=0 blks
> realtime =none extsz=65536 blocks=0, rtextents=0
>
> it is mounted rw,noatime,nodiratime
> generally I am doing 32k reads from the application, so I would like
> larger blocksize (32k would be ideal), but can't go above 2k on intel...

4K you mean (thats what you've got above, and thats your max with
a 4K pagesize).

I thought you said you had a 2TB file? The filesystem above is
4096 * 214670562 blocks, i.e. 818GB. Perhaps its a sparse file?
I guess I could look closer at the bmap and figure that out for
myself. ;)

> I made the file my copying it over via dd from another machine onto a
> clean partition... then from that point we just append to the end of it,
> or modify existing data...

> I am attaching the extent map

Interesting. So, the allocator did an OK job for you, at least
initially - everything is contiguous (within the bounds of the
small agsize you set) until extent #475, and I guess that'd have
been the end of the initial copied file. After that it chops
about a bit (goes back to earlier AGs and uses the small amounts
of space in each I'm guessing), then gets back into nice long
contiguous extent allocations in the high AGs.

Anyway, you should be able to alleviate the problem by:

- Using a small number of larger AGs (say 32 or so) instead of
a large number of small AGs. this'll give you most bang for
your buck I expect.
[ If you use a mkfs.xfs binary from an xfsprogs anytime since
November 2003, this will automatically scale for you - did you
use a very old mkfs? Or set the agcount/size values by hand?
Current mkfs would give you this:
# mkfs.xfs -isize=2k -dfile,name=/dev/null,size=214670562b -N
meta-data=/dev/null isize=2048 agcount=32, agsize=6708455 blks
...which is just what you want here. ]

- Preallocate the space in the file - i.e. before running the
dd you can do an "xfs_io -c 'resvsp 0 2t' /mnt/array/disk1/xxx"
(preallocates 2 terabytes) and then overwrite that. Yhis will
give you an optimal layout.

- Not sure about your stripe unit/width settings, I would need
to know details about your RAID. But maybe theres tweaking that
could be done there too.

- Your extent map is fairly large, the 2.6.17 kernel will have
some improvements in the way the memory management is done here
which may help you a bit too.

Have fun!

cheers.

--
Nathan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/