Re: xfs: does mkfs.xfs require fancy switches to get decent performance? (was Tux3 Report: How fast can we fsync?)

From: Daniel Phillips
Date: Wed May 13 2015 - 07:30:59 EST

On 05/13/2015 12:25 AM, Pavel Machek wrote:
> On Mon 2015-05-11 16:53:10, Daniel Phillips wrote:
>> Hi Pavel,
>> On 05/11/2015 03:12 PM, Pavel Machek wrote:
>>>>> It is a fact of life that when you change one aspect of an intimately interconnected system,
>>>>> something else will change as well. You have naive/nonexistent free space management now; when you
>>>>> design something workable there it is going to impact everything else you've already done. It's an
>>>>> easy bet that the impact will be negative, the only question is to what degree.
>>>> You might lose that bet. For example, suppose we do strictly linear allocation
>>>> each delta, and just leave nice big gaps between the deltas for future
>>>> expansion. Clearly, we run at similar or identical speed to the current naive
>>>> strategy until we must start filling in the gaps, and at that point our layout
>>>> is not any worse than XFS, which started bad and stayed that way.
>>> Umm, are you sure? If "some areas of disk are faster than others" is
>>> still true on today's hard drives, the gaps will decrease the
>>> performance (as you'll "use up" the fast areas more quickly).
>> That's why I hedged my claim with "similar or identical". The
>> difference in media speed seems to be a relatively small effect
> When you knew it can't be identical? That's rather confusing, right?

Maybe. The top of thread is about a measured performance deficit of
a factor of five. Next to that, a media transfer rate variation by
a factor of two already starts to look small, and gets smaller when

Let's say our delta size is 400MB (typical under load) and we leave
a "nice big gap" of 112 MB after flushing each one. Let's say we do
two thousand of those before deciding that we have enough information
available to switch to some smarter strategy. We used one GB of a
4TB disk, say. The media transfer rate decreased by a factor of:

(1 - 2/1000) = .998, that is, a decrease of .2%.

The performance deficit in question and the difference in media rate are
three orders of magnitude apart. Does that justify the term "similar or
identical"?
> Perhaps you should post more details how your benchmark is structured
> next time, so we can see you did not make any trivial mistakes...?

Makes sense to me, though I do take considerable care to ensure that
my results are reproducible. That is borne out by the fact that Mike
did reproduce, albeit from the published branch, which is a bit behind
current work. And he went on to do some original testing of his own.

I had no idea Tux3 was so much faster than XFS on the Git self test,
because we never specifically tested anything like that, or optimized
for it. Of course I was interested in why. And that was not all: Mike
also noticed a really interesting fact about latency that I failed to
reproduce. That went on to the list of things to investigate as time

I reproduced Mike's results according to his description, by actually
building Git in the VM and running the selftests just to see if the same
thing happened, which it did. I didn't think that was worth mentioning
at the time, because if somebody publishes benchmarks, my first instinct
is to trust them. Trust and verify.

> Or just clean the code up so that it can get merged, so that we can
> benchmark ourselves...

Third possibility: build from our repository, as Mike did. Obviously,
we need to merge to master so the build process matches the Wiki. But
Hirofumi is busy with other things, so please be patient.

