metadata plugins (was Re: the " 'official' point of view" expressedby kernelnewbies.org regarding reiser4 inclusion)

From: Jeff Garzik
Date: Fri Jul 28 2006 - 07:13:10 EST


Hans Reiser wrote:
Jeff Garzik wrote:
Plugins --do not-- mean that you can just change the filesystem format
willy-nilly, with zero impact.

Yes they do.....

Take off your marketing cap for a second :)

Plugins do not eliminate the going-back-to-older-kernels issue, as discussed at length in the recent ext4 thread. Plugins complicate, rather than simplify, the issue of determining the compatibility between the kernel's metadata handling code and on-disk metadata.

Quite simply, there is always a compatibility impact when the filesystem format changes. Enterprise customers will inevitably run several -generations- of your software, and they are very aware of migration costs when dealing with hardware+software life cycles in the 5-10 year range. Power users and kernel developers bounce between kernel versions all the time.

This is the same problem I described in the ext4 thread, then summarized as "what ext3 am I talking to, today?" You can easily rewrite that as "what reiser4 set of plugins am I talking to, today?"

Plugins == a ton of new software to be versioned and tracked. Rather than just track "reiser4", you must now track the collection of plugin versions as well. OBVIOUS increased complexity, and OBVIOUS additional admin learning curve for the serious IT shop.


Now prepare for the bombshell revelation...


THAT SAID, I do think fs {algorithm|metadata} plugins are a very interesting idea.

The plugin infrastructure is the logical result of trying to support multiple incompatible {inode, dir, file data, ...} object types in the same Linux filesystem. Looking at in-tree Linux filesystems, one can see common design patterns in filesystems that hand-code support for multiple metadata format versions -- Al Viro's fs/sysv hacks being a simple and elegant example.

It is then simple to follow that train of logic: why not make it easy to replace the directory algorithm [and associated metadata]? or the file data space management algorithms? or even the inode handling? why not allow customers to replace a stock algorithm with an exotic, site-specific one?

In essence, reiser4 is not really a filesystem, but a second VFS, with a few stock metadata modules.

Furthermore, it completely changes the notion of what a Linux filesystem is. Currently, each Linux filesystem is a tightly constrained set of metadata support. reiser4 changes "tightly constrained" to "infinity". While that freedom is certainly liberating, it also has obvious support costs due to new admin paradigms and customer configuration possibilities.

I don't want to be the distro support person trying to fix a crash in "reiser4", where the customer has secretly replaced the standard inode data structure with a plugin written by an intern, and secretly replaced the directory algorithm with a closed source plugin from PickYourVendor. Trying picking through that mess with a filesystem debugger.

The reiser4 plugin system is far better implemented as the VFS level, as an optional library, like fs/libfs.c. Such a library would contain factored-out infrastructure common to all filesystems that support multiple versions of the same object type, thereby shrinking the "real" filesystem code. Then add a tiny bit of code that allows addition and removal of object types within each filesystem.

At that point, you have reduced a Linux filesystem to binding between a common name ("reiser4", "ext3") and a set of cooperating modules.

A very interesting train of thought. But not one that belongs in the kernel in its current form. If you hide a second VFS inside fs/reiser4, I GUARANTEE that you will get the locking and multi-threading wrong. I guarantee your code will be under-reviewed, and NAK'd. You lose all the testing that would be gained by others travelling your code paths, if the code was implemented generically in fs/libplugin.c rather than fs/reiser4/*

To move forward towards the goal of merging reiser4 into the kernel, you must recognize two distinctly SEPARATE ISSUES, and deal with them separately:

1) merging the latest whiz-bang collection of filesystem algorithms, under the filesystem name "reiser4"
2) adding the plugin concept to the Linux VFS

I don't know if #2 will pass consensus, but I certainly think we should at least experiment in that direction. Your guys definitely have a lot of good ideas, and I'm glad Linux is seeing a fresh injection of new ideas.

I just don't think its a good idea to merge a filesystem called "foo" that can be completely redefined in the field. Or least, not without thinking a lot more on the impact. Its a VFS _and_ a filesystem, and should be evaluated as such.

Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/