Re: idea for filesystem to LVM interface

Adam D. Bradley (artdodge@cs.bu.edu)
Fri, 26 Jun 1998 20:01:07 -0400 (EDT)


Matt,

> I'm new to kernel hacking, so please, get harsh on this idea before I
> waste any more time on it.

I love candid introductions ;-)

> The ideas discussed so far for growable/shrinkable filesystem seem to
> me to be modeled on the process to kernel brk/sbrk interface. Why not
> implement an LVM as something that doles out and takes back relatively
> small (4mb to 128mb) chunks of disk space which are individually
> linearly addressable. In other words, model the interface more on
> malloc/free as opposed to brk/sbrk.

"If your filesystem doesn't need a contiguous logical volume, your
filesystem doesn't need an LVM."

To be more verbose (if not more precise): what you suggest is simply a
different abstraction than what an LVM, by definition, provides. Your
model could very easily (well, maybe ;->) be implemented by treating
available partitions/raw disks/md devices/etc as the allocatable
extents, or by using a simple "extent manager" to dole out chunks from
an available pool spanning multiple devices. So call it something
besides an LVM... a "Logical Extent Manager"?
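
As a rough userspace sketch of such an "extent manager" (every name
here, pool_add_dev/pool_alloc/pool_free included, is hypothetical and
not an existing kernel interface; a plain unsigned int stands in for
kdev_t, and chunks are assumed to be fixed-size), a first-fit pool
spanning several devices might look like:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_MAX_DEVS   4    /* arbitrary limits for this sketch */
#define POOL_MAX_CHUNKS 64

struct pool_dev {
    unsigned int  dev;                    /* stand-in for a kdev_t */
    unsigned int  nchunks;                /* chunks this device contributes */
    unsigned char used[POOL_MAX_CHUNKS];  /* 1 = currently handed out */
};

struct chunk_pool {
    struct pool_dev devs[POOL_MAX_DEVS];
    size_t          ndevs;
};

struct chunk_ref {
    unsigned int dev;    /* which device the chunk lives on */
    unsigned int index;  /* chunk number within that device */
};

/* Add a device contributing nchunks chunks; returns 0 on success. */
int pool_add_dev(struct chunk_pool *p, unsigned int dev, unsigned int nchunks)
{
    struct pool_dev *d;

    assert(p != NULL);
    if (p->ndevs == POOL_MAX_DEVS || nchunks > POOL_MAX_CHUNKS)
        return -1;
    d = &p->devs[p->ndevs++];
    d->dev = dev;
    d->nchunks = nchunks;
    memset(d->used, 0, sizeof d->used);
    return 0;
}

/* First-fit: take the first free chunk on the first device with room. */
int pool_alloc(struct chunk_pool *p, struct chunk_ref *out)
{
    size_t i;
    unsigned int c;

    assert(p != NULL && out != NULL);
    for (i = 0; i < p->ndevs; i++) {
        struct pool_dev *d = &p->devs[i];
        for (c = 0; c < d->nchunks; c++) {
            if (!d->used[c]) {
                d->used[c] = 1;
                out->dev = d->dev;
                out->index = c;
                return 0;
            }
        }
    }
    return -1;  /* pool exhausted */
}

/* Return a chunk to the pool; -1 if it was not allocated. */
int pool_free(struct chunk_pool *p, struct chunk_ref ref)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < p->ndevs; i++) {
        struct pool_dev *d = &p->devs[i];
        if (d->dev == ref.dev && ref.index < d->nchunks && d->used[ref.index]) {
            d->used[ref.index] = 0;
            return 0;
        }
    }
    return -1;
}
```

The caller sets ndevs to 0 before adding devices. Note that nothing
here promises contiguity across chunks, which is exactly why it is a
different abstraction from a logical volume.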

> In one version of this approach the intra-kernel interface would look
> like this...
>
> chunk_t *chunk_alloc( lvolume_t *lvolume, long blcks );
> int chunk_write( chunk_t *chunk, long blck, const char *buf );
> int chunk_read( chunk_t *chunk, int blck, char *buf );
> int chunk_free( lvolume_t *lvolume, chunk_t *chunk );

I have three concerns about this kind of interface.

1) It presumes that there is some agent in the filesystem kernel code
which decides when new extents are needed. I would rather see this
policy decision reside in user space, from where it could make two
separate syscalls: first to prepare the underlying volume manager to
have a new extent used by the filesystem, and a second one to the
filesystem telling it where to find a new extent to use.

2) It is unidirectional; but this is, again, only a concern if we
presume that there are kernel-space agents making policy decisions.

3) The "chunk_t *" needs to be reduced to a data type that can appear
on-disk in things like directories and indirection blocks.

I think it would be much simpler to have a single, linearly
addressable logical device available, of which only sparse extents may
actually be available. So the only new interfaces needed would be:

unsigned long allocate_extent(kdev_t logicaldevice, unsigned long blocks);
(returns first logical block number in new extent)

int free_extent(kdev_t logicaldevice, unsigned long base);

That way the filesystem can just use bread(logicaldevice, blockno) as
it always has, allowing the logical device (i.e. the extent manager)
to map from blockno to "real" (kdev_t,block) tuples.

This scheme also eliminates #3 above as a concern.
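
To make the mapping concrete, here is a minimal userspace sketch, not
kernel code. The names (map_add_extent, map_remove_extent, map_lookup,
dev_id_t) are invented for illustration, dev_id_t stands in for kdev_t,
and the caller is assumed to supply the physical location of each new
extent. Logical block numbers are handed out by a simple bump counter,
so removing an extent leaves a hole in the logical address space:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXTENTS 16                      /* arbitrary limit for the sketch */
#define NO_BLOCK    ((unsigned long)-1)     /* "no such block" marker */

typedef unsigned int dev_id_t;              /* stand-in for kdev_t */

struct extent {
    unsigned long logical_base;  /* first logical block of the extent */
    unsigned long blocks;        /* length in blocks */
    dev_id_t      phys_dev;      /* backing device */
    unsigned long phys_base;     /* first block on the backing device */
    int           in_use;
};

struct extent_map {
    struct extent ext[MAX_EXTENTS];
    unsigned long next_logical;  /* next never-used logical block number */
};

/* Map a new extent; returns its first logical block, or NO_BLOCK. */
unsigned long map_add_extent(struct extent_map *m, dev_id_t dev,
                             unsigned long phys_base, unsigned long blocks)
{
    size_t i;

    assert(m != NULL);
    if (blocks == 0)
        return NO_BLOCK;
    for (i = 0; i < MAX_EXTENTS; i++) {
        struct extent *e = &m->ext[i];
        if (!e->in_use) {
            e->logical_base = m->next_logical;
            e->blocks = blocks;
            e->phys_dev = dev;
            e->phys_base = phys_base;
            e->in_use = 1;
            m->next_logical += blocks;
            return e->logical_base;
        }
    }
    return NO_BLOCK;  /* table full */
}

/* Unmap the extent starting at logical block 'base'; 0 on success. */
int map_remove_extent(struct extent_map *m, unsigned long base)
{
    size_t i;

    assert(m != NULL);
    for (i = 0; i < MAX_EXTENTS; i++) {
        if (m->ext[i].in_use && m->ext[i].logical_base == base) {
            m->ext[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}

/* Translate a logical block to (device, block); -1 if it is in a hole. */
int map_lookup(const struct extent_map *m, unsigned long blockno,
               dev_id_t *dev, unsigned long *phys_block)
{
    size_t i;

    assert(m != NULL && dev != NULL && phys_block != NULL);
    for (i = 0; i < MAX_EXTENTS; i++) {
        const struct extent *e = &m->ext[i];
        if (e->in_use && blockno >= e->logical_base &&
            blockno - e->logical_base < e->blocks) {
            *dev = e->phys_dev;
            *phys_block = e->phys_base + (blockno - e->logical_base);
            return 0;
        }
    }
    return -1;
}
```

The map must be zero-initialized before use. Since the filesystem only
ever stores plain logical block numbers, nothing pointer-like has to
appear on disk.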

One problem worth noting: you need to be sure the logical-to-physical
mapping will persist (or be easily reconstructible) across system
failures.

I'm in the midst of designing a filesystem (XLFS) that can take
advantage of multiple volumes without requiring the use of an
intervening LVM layer (although it can take advantage of one). It's
inspired largely by the features of AdvFS and the architecture of the
BSD LFS (Log-structured). The LFS design makes it almost trivial to
add and remove volumes from a filesystem, to share a volume set
between multiple mountable "filesets", and to take advantage of
resizable logical volumes (contiguous is truly trivial;
non-contiguous requires a little more work, but that's a non-existent
interface, while a Linux LVM exists now). I'm working on the
whitepaper/technical specification now; expect a note to hit the list
in the near future with a URL.

> ps- I'm only on the linux-kernel-digest, so I would appreciate a cc: to
> mgrosso@acm.org. Thanks

Will do.

> pps- Is there really a linux filesystem development mailing list? Can
> anyone direct me to that? web page?

Yes: linux-fsdevel@vger.rutgers.edu. It was announced on linux-kernel
just a day or two ago. To subscribe, send a message with
"subscribe linux-fsdevel"
in the body to majordomo@vger.rutgers.edu.

Adam

--
You crucify all honesty             \\Adam D. Bradley  artdodge@cs.bu.edu
No signs you see do you believe      \\Boston University Computer Science
And all your words just twist and turn\\    Grad Student and Linux Hacker
Reviving just to crash and burn        \\                             <><
--------->   Why can't you listen as love screams everywhere?   <--------
