Re: xfs: very slow after mount, very slow at umount

From: Stan Hoeppner
Date: Thu Jan 27 2011 - 14:49:53 EST

Mark Lord put forth on 1/27/2011 10:03 AM:
> On 11-01-27 10:40 AM, Justin Piszcz wrote:
>> On Thu, 27 Jan 2011, Mark Lord wrote:
> ..
>>> Can you recommend a good set of mkfs.xfs parameters to suit the characteristics
>>> of this system? Eg. Only a few thousand active inodes, and nearly all files are
>>> in the 600MB -> 20GB size range. The usage pattern it must handle is up to
>>> six concurrent streaming writes at the same time as up to three streaming reads,
>>> with no significant delays permitted on the reads.
>>> That's the kind of workload that I find XFS handles nicely,
>>> and EXT4 has given me trouble with in the past.
> ..
>> I did a load of benchmarks a long time ago testing every mkfs.xfs option there
>> was, and I found that most of the time (if not all), the defaults were the best.
> ..
> I am concerned with fragmentation on the very special workload in this case.
> I'd really like the 20GB files, written over a 1-2 hour period, to consist
> of a very few very large extents, as much as possible.

For XFS that's actually not a special case workload but an average one. XFS was
conceived at SGI for use on large supercomputers where typical single file
datasets are extremely large, i.e. hundreds of GB. Also note that the real time
sub volume feature was created for almost exactly your purpose: streaming
record/playback of raw A/V data for broadcast (i.e. television). In your case
it's compressed, not raw A/V data. I'm not recommending you use the real time
feature however, as it's overkill for MythTV and not necessary.

> Rather than hundreds or thousands of "tiny" MB sized extents.
> I wonder what the best mkfs.xfs parameters might be to encourage that?

You need to use the mkfs.xfs defaults for any single drive filesystem, and trust
the allocator to do the right thing. XFS uses variable size extents and the
size is chosen dynamically--you don't have direct or indirect control of the
extent size chosen for a given file or set of files AFAIK.

As Dave Chinner is fond of pointing out, it's those who don't know enough about
XFS and choose custom settings that most often get themselves into trouble (as
you've already done once). :)

The defaults exist for a reason, and they weren't chosen willy nilly. The vast
bulk of XFS' configurability exists for tuning maximum performance on large to
very large RAID arrays. There isn't much, if any, additional performance to be
gained with parameter tweaks on a single drive XFS filesystem.

A brief explanation of agcount: the filesystem is divided into agcount regions
called allocation groups, or AGs. The allocator writes to all AGs in parallel
to increase performance. With extremely fast storage (SSD, large high RPM RAID)
this increases throughput as the storage can often sink writes faster than a
serial writer can push data. In your case, you have a single slow spindle with
over 7,000 AGs, so the allocator is trying to write to over 7,000 locations on
that single disk simultaneously. The poor head on that drive is being whipped
all over the place without actually getting much writing done. To add insult to
injury, this is one of those low RPM, low head performance "green" drives,
correct?
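
To put rough numbers on the agcount point, here is a minimal Python sketch. The
2 TB drive size and the agcount of 4 used for comparison are assumptions made
purely for illustration, not figures from this thread; only the 7,000 AG count
and the 20GB file size come from the discussion above. It ignores metadata
overhead and simply divides the space evenly.

```python
def ag_size_bytes(fs_bytes: int, agcount: int) -> int:
    """Approximate size of one allocation group (even split, no metadata)."""
    if agcount <= 0:
        raise ValueError("agcount must be positive")
    return fs_bytes // agcount


def min_ags_spanned(file_bytes: int, ag_bytes: int) -> int:
    """Fewest AGs a file of this size must touch, since no single
    extent can be larger than the AG that holds it."""
    return -(-file_bytes // ag_bytes)  # integer ceiling division


FS_BYTES = 2 * 10**12     # ASSUMPTION: hypothetical 2 TB drive
FILE_BYTES = 20 * 10**9   # 20GB recording, as described in the thread

# agcount=4 is an assumed small value for comparison; 7000 is from the thread.
for agcount in (4, 7000):
    ag = ag_size_bytes(FS_BYTES, agcount)
    spanned = min_ags_spanned(FILE_BYTES, ag)
    print(f"agcount={agcount}: ~{ag / 10**6:.0f} MB per AG, "
          f"a 20GB file touches at least {spanned} AG(s)")
```

Under these assumptions, each of the 7,000 AGs is only a few hundred MB, so a
20GB file cannot fit in one AG and is forced into many separate extents no
matter what the allocator does.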

Trust the defaults. If they give you problems (unlikely) then we can talk. ;)
