Re: [GIT PULL] ext4 updates for 3.11

From: Dilger, Andreas
Date: Wed Jul 03 2013 - 14:40:48 EST


On 2013/03/07 12:12 PM, "Greg KH" <greg@xxxxxxxxx> wrote:

>On Wed, Jul 03, 2013 at 01:29:41PM +1000, Dave Chinner wrote:
>> On Tue, Jul 02, 2013 at 06:01:11PM -0700, Greg KH wrote:
>> > On Tue, Jul 02, 2013 at 05:58:15PM -0700, Linus Torvalds wrote:
>> > > On Tue, Jul 2, 2013 at 5:54 PM, Greg KH <greg@xxxxxxxxx> wrote:
>> > > > On Tue, Jul 02, 2013 at 05:02:21PM -0700, Linus Torvalds wrote:
>> > > >>
>> > > >> I'm really not convinced this whole Lustre thing was correctly
>> > > >> handled. Merging it into stable and yet being in such bad shape that
>> > > >> it isn't enabled even there? I just dunno. But I have the turd in my
>> > > >> tree now, let's hope it gets fixed up.
>> > > >
>> > > > It's in "staging", not "stable" :)
>> > >
>> > > Yes. But what was the reason to actually merge it even there? And once
>> > > it gets merged, disabling it again rather than fixing the problems it
>> > > has?
>> >
>> > The problems turned out to be too big, too late in the merge cycle for
>> > me to be able to take them (they still aren't even done, as I don't have
>> > a working set of patches yet.) So I just disabled it from the build to
>> > give Andreas and team time to get it working properly.

In our defence, the code has been working fine for years, but only on
vendor kernels, so we are playing catch-up to the mainline kernel, and
hit a bunch of snags when merging into -next.

Also, all of the configure checks have been removed from the version
submitted to the kernel, so this caused some breakage on platforms that
Lustre actually runs on regularly (e.g. PPC). On the flip side, nobody
ever uses Lustre on S390 or 32-bit clients, so it is no surprise that
there were problems there.

>> > I could have just removed it, but I thought I would give them a chance.

Thanks. The code is just too big to get it ready for inclusion in one
piece, and the only way that we can make it acceptable for mainline
kernel inclusion is through -staging and incrementally cleaning it up.

>> > > This is a filesystem that Intel apparently wants to push. I think it
>> > > would have been a better idea to push back a bit and say "at least
>> > > clean it up a bit first". It's not like Intel is one of the clueless
>> > > companies that couldn't have done so and need help from the community.

Well, it's been around for 10 years, and is pretty much the standard
filesystem in HPC. While we are part of Intel now, there is still only a
limited number of people working on it, and we don't have free rein to
focus on getting it into the kernel. We still have customers to support
and bugs to fix and features to develop for the next huge systems (1B
cores writing 300TB/s to 1EB fs in 2018). At the same time, there is
enough demand at the workgroup/department/university scale that it makes
sense to try and get it into mainline.

It isn't that we didn't want to get it into the kernel previously, but
-staging didn't always exist and we don't have enough resources at one
time to rewrite all of the code. Thanks to Peng Tao and EMC this is
finally happening. This isn't a "volunteer community" effort; there are
dedicated resources working on it.

>> > For this filesystem, it seems that they don't have any resources to do
>> > this work and are relying on the community to help out. Which is odd,
>> > but big companies are strange some times...
>>
>> Didn't we learn this lesson already with POHMELFS? i.e. that dumping
>> filesystem code in staging on the assumption "the community" will
>> fix it up when nobody in "the community" uses or can even test that
>> filesystem is a broken development model....
>
>They (Intel) has said that they will continue to clean up this code in
>the tree, until it is in good enough shape to be merged into fs/
>properly. If they ever stop helping out, I will end up dropping it from
>the tree, just like I did for pohmelfs, so don't worry about it
>lingering around abandoned.

Right, we are going to continue working on cleaning the code at a steady
pace until it is ready to move to fs/. I don't expect Al or Dave or
Christoph to spend their time (or make their eyes bleed) with the current
state of the code. It has already undergone some significant cleanup, but
needs a bunch more still.

To be honest, I expect it will be in -staging for a year or so, but that
is fine with me since we've been working on it for 10+ years already and
we only have so much capacity for changing/testing the code for the
kernel while keeping all of the existing sites in working condition.

Cheers, Andreas
--
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division

