Re: xfs: does mkfs.xfs require fancy switches to get decent performance? (was Tux3 Report: How fast can we fsync?)

From: Daniel Phillips
Date: Tue May 12 2015 - 16:55:00 EST


On 05/12/2015 11:39 AM, David Lang wrote:
> On Mon, 11 May 2015, Daniel Phillips wrote:
>>> ...it's the mm and core kernel developers that need to
>>> review and accept that code *before* we can consider merging tux3.
>>
>> Please do not say "we" when you know that I am just as much a "we"
>> as you are. Merging Tux3 is not your decision. The people whose
>> decision it actually is are perfectly capable of recognizing your
>> agenda for what it is.
>>
>> http://www.phoronix.com/scan.php?page=news_item&px=MTA0NzM
>> "XFS Developer Takes Shots At Btrfs, EXT4"
>
> umm, Phoronix has no input on what gets merged into the kernel. they also have a reputation for
> trying to turn anything into click-bait by making it sound like a fight when it isn't.

Perhaps you misunderstood. Linus decides what gets merged. Andrew
decides. Greg decides. Dave Chinner does not decide, he just does
his level best to create the impression that our project is unfit
to merge. Any chance there might be an agenda?

Phoronix published a headline that identifies Dave Chinner as
someone who takes shots at other projects. Seems pretty much on
the money to me, and it ought to be obvious why he does it.

>> The real question is, has the Linux development process become
>> so political and toxic that worthwhile projects fail to benefit
>> from supposed grassroots community support. You are the poster
>> child for that.
>
> The linux development process is making code available, responding to concerns from the experts in
> the community, and letting the code talk for itself.

Nice idea, but it isn't working. Did you let the code talk to you?
Right, you let the code talk to Dave Chinner, then you listen to
what Dave Chinner has to say about it. Any chance that there might
be some creative licence acting somewhere in that chain?

> There have been many people pushing code for inclusion that has not gotten into the kernel, or has
> not been used by any distros after it's made it into the kernel, in spite of benchmarks being posted
> that seem to show how wonderful the new code is. ReiserFS was one of the first, and part of what
> tarnished its reputation with many people was how much they were pushing the benchmarks that were
> shown to be faulty (the one I remember most vividly was that the entire benchmark completed in <30
> seconds, and they had the FS tuned to not start flushing data to disk for 30 seconds, so the entire
> 'benchmark' ran out of ram without ever touching the disk)

You know what to do about checking for faulty benchmarks.

> So when Ted and Dave point out problems with the benchmark (the difference in behavior between a
> single spinning disk, different partitions on the same disk, SSDs, and ramdisks), you would be
> better off acknowledging them and if you can't adjust and re-run the benchmarks, don't start
> attacking them as a result.

Ted and Dave failed to point out any actual problem with any
benchmark. They invented issues with benchmarks and promoted those
as FUD.

> As Dave says above, it's not the other filesystem people you have to convince, it's the core VFS and
> Memory Management folks you have to convince. You may need a little benchmarking to show that there
> is a real advantage to be gained, but the real discussion is going to be on the impact that page
> forking is going to have on everything else (both in complexity and in performance impact to other
> things)

Yet he clearly wrote "we" as if he believes he is part of it.

Now that ENOSPC is done to a standard way beyond what Btrfs had
when it was merged, the next item on the agenda is writeback. That
involves us and VFS people as you say, and not Dave Chinner, who
only intends to obstruct the process as much as he possibly can. He
should get back to work on his own project. Nobody will miss his
posts if he doesn't make them. They contribute nothing of value,
create a lot of bad blood, and just serve to further besmirch the
famously tarnished reputation of LKML.

>> You know that Tux3 is already fast. Not just that of course. It
>> has a higher standard of data integrity than your metadata-only
>> journalling filesystem and a small enough code base that it can
>> be reasonably expected to reach the quality expected of an
>> enterprise class filesystem, quite possibly before XFS gets
>> there.
>
> We wouldn't expect anyone developing a new filesystem to believe any differently.

It is not a matter of belief, it is a matter of testable fact. For
example, you can count the lines. You can run the same benchmarks.

Proving the data consistency claims would be a little harder, you
need tools for that, and some of those aren't built yet. Or, if you
have technical ability, you can read the code and the copious design
material that has been posted and convince yourself that, yes, there
is something cool here, why didn't anybody do it that way before?
But of course that starts to sound like work. Debating nontechnical
issues and playing politics seems so much more like fun.

> If they didn't
> believe this, why would they be working on the filesystem instead of just using an existing filesystem.

Right, and it is my job to convince you that what I believe for
perfectly valid, demonstrable technical reasons, is really true. I do
not see why you feel it is your job to convince me that the obviously
broken Linux community process is not in fact broken, and that a
certain person who obviously has an agenda is not actually obstructing.

> The ugly reality is that everyone's early versions of their new filesystem look really good. The
> problem is when they extend it to cover the corner cases and when it gets stressed by real-world (as
> opposed to benchmark) workloads. This isn't saying that you are wrong in your belief, just that you
> may not be right, and nobody will know until you are to a usable state and other people can start
> beating on it.

With ENOSPC we are at that state. Tux3 would get more testing and advance
faster if it was merged. Things like ifdefs, grandiose new schemes for
writeback infrastructure, dumb little hooks in the mkwrite path, those
are all just manufactured red herrings. Somebody wanted those to be
issues, so now they are issues. Fake ones.

Nobody is trying to trick you. Just stating a fact. You ought to be able
to figure out by now that Tux3 is worth merging.

You might possibly have an argument that merging a filesystem that
crashes as soon as it fills the disk is just sheer stupidity that can
only lead to embarrassment in the long run, but then you would need to
explain why Btrfs was merged. As I recall, it went something like, Chris
had it on a laptop, so it must be a filesystem, and wow look at that
feature list. Then it got merged in a completely unusable state and got
worked on. If it had not been merged, Btrfs would most likely be dead
right now. After all, who cares about an out of tree filesystem?

By the way, I gave my Tux3 presentation at SCALE 7x in Los Angeles in
2009, with Tux3 running as my root filesystem. By the standard applied
to Btrfs, Tux3 should have been merged then, right? After all, our
nospace handling worked just as well as theirs at that time.

Regards,

Daniel