Re: Notes on support for multiple devices for a single filesystem

From: Andrew Morton
Date: Wed Dec 17 2008 - 14:53:59 EST


On Wed, 17 Dec 2008 08:23:44 -0500
Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

> FYI: here's a little writeup I did this summer on support for
> filesystems spanning multiple block devices:
>
>
> --
>
> === Notes on support for multiple devices for a single filesystem ===
>
> == Intro ==
>
> Btrfs (and an experimental XFS version) can support multiple underlying block
> devices for a single filesystem instances in a generalized and flexible way.
>
> Unlike the support for external log devices in ext3, jfs, reiserfs, XFS, and
> the special real-time device in XFS all data and metadata may be spread over a
> potentially large number of block devices, and not just one (or two)
>
>
> == Requirements ==
>
> We want a scheme to support these complex filesystem topologies in way
> that is
>
> a) easy to setup and non-fragile for the users
> b) scalable to a large number of disks in the system
> c) recoverable without requiring user space running first
> d) generic enough to work for multiple filesystems or other consumers
>
> Requirement a) means that a multiple-device filesystem should be mountable
> by a simple fstab entry (UUID/LABEL or some other cookie) which continues
> to work when the filesystem topology changes.

"device topology"?

> Requirement b) implies we must not do a scan over all available block devices
> in large systems, but use an event-based callout on detection of new block
> devices.
>
> Requirement c) means there must be some version to add devices to a filesystem
> by kernel command lines, even if this is not the default way, and might require
> additional knowledge from the user / system administrator.
>
> Requirement d) means that we should not implement this mechanism inside a
> single filesystem.
>

One thing I've never seen comprehensively addressed is: why do this in
the filesystem at all? Why not let MD take care of all this and
present a single block device to the fs layer?

Lots of filesystems are violating this, and I'm sure the reasons for
this are good, but this document seems like a suitable place in which to
briefly decribe those reasons.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/