Re: (reiserfs) Re: LVM / Filesystems / High availability

Colin Plumb (colin@nyx.net)
Mon, 22 Jun 1998 18:26:30 -0600 (MDT)


Thanks to everyone for lots of good information. Can I ask for a little
bit more?

I have a partial understanding of how an LVM works, but I could use more.
If anyone knows any detailed descriptions of an implementation on the
web, I'd be interested.

Building a fake device out of bits of real devices is not that complicated.
The RAID code does this and the file system doesn't even need to know about
it.
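
As a rough illustration of the "fake device" idea, here is a minimal Python
sketch of simple concatenation. The class name LinearDevice, the method
map_block and the block counts are hypothetical, chosen for the example;
they are not taken from the RAID code or from any real LVM.

```python
from bisect import bisect_right


class LinearDevice:
    """Concatenate several real devices into one logical block range."""

    def __init__(self, sizes):
        # sizes: number of blocks on each real device, in concatenation order
        self.sizes = list(sizes)
        self.starts = []          # first logical block of each device
        total = 0
        for n in self.sizes:
            self.starts.append(total)
            total += n
        self.total = total

    def map_block(self, logical):
        """Return (device index, block offset on that device)."""
        if not 0 <= logical < self.total:
            raise IndexError("logical block out of range")
        # Find the last device whose starting block is <= logical.
        dev = bisect_right(self.starts, logical) - 1
        return dev, logical - self.starts[dev]


vol = LinearDevice([2000, 1000, 2000])
print(vol.map_block(0))      # (0, 0)
print(vol.map_block(2500))   # (1, 500)
print(vol.map_block(4999))   # (2, 1999)
```

The file system only ever sees block numbers 0..4999; the translation to a
real device happens underneath it.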

The tricky part comes when you want to add or remove real devices, because
then your fake device changes size, and the file system needs to know
about *that*.

Adding blocks is fairly straightforward. You just add the blocks to
the free-block bit map and let the file system use them when it
needs them. (There are complexities like "where do I put the newly
enlarged bit map?" but I'll skip over them for now.)
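
A minimal sketch of the growing case, assuming the free-block map is an
in-memory list of flags. FreeMap, grow and allocate are hypothetical names,
and the question of where the enlarged map lives on disk is ignored here
just as it is above.

```python
class FreeMap:
    """Toy free-block map: one flag per block, True meaning free."""

    def __init__(self, nblocks):
        self.free = [True] * nblocks

    def grow(self, extra):
        """Append `extra` new blocks, all free; return the first new block."""
        first_new = len(self.free)
        self.free.extend([True] * extra)
        return first_new

    def allocate(self):
        """First-fit allocation; raises ValueError when nothing is free."""
        i = self.free.index(True)
        self.free[i] = False
        return i


fm = FreeMap(2)
fm.allocate()            # takes block 0
fm.allocate()            # takes block 1, map is now full
first = fm.grow(3)       # new blocks 2, 3 and 4 appear as free
print(first, fm.allocate())   # 2 2
```

Nothing already allocated has to move; the allocator simply finds the new
blocks the next time it looks for free space.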

Shrinking a file system is more complicated, because there may be data
in the blocks being taken out of service, and this data needs to be copied
somewhere else, and all pointers to the data updated. This is very
similar to defragmenting a disk. If you hack your defragmenter to
consider the to-be-removed range of blocks to be in dire need of recopying
to free ranges elsewhere, and never available for being copied to,
then it can do the job for you.
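
The hacked-defragmenter idea can be sketched as follows. This is a toy
model with assumed names: `blocks` is a dict from block number to contents,
`free` is a set of free block numbers, and the returned relocation map
stands in for "update all pointers to the data". It does nothing about
doing this atomically or on line.

```python
def evacuate(blocks, free, lo, hi):
    """Move every in-use block in [lo, hi) to a free block outside it.

    Returns a dict {old block number: new block number}.
    """
    # Blocks being taken out of service are never valid destinations.
    free.difference_update(range(lo, hi))
    targets = sorted(free)
    victims = sorted(b for b in blocks if lo <= b < hi)
    if len(victims) > len(targets):
        raise RuntimeError("not enough free space outside the range")
    moved = {}
    for old, new in zip(victims, targets):
        blocks[new] = blocks.pop(old)   # copy the data elsewhere
        free.discard(new)               # the destination is now in use
        moved[old] = new                # pointers must be rewritten old -> new
    return moved


blocks = {0: "a", 5: "b", 6: "c"}
free = {1, 2, 3, 4, 7}
print(evacuate(blocks, free, 4, 8))   # {5: 1, 6: 2}
print(blocks)                         # {0: 'a', 1: 'b', 2: 'c'}
```

After the call, blocks 4..7 hold no data and are absent from the free set,
so the range can be dropped from the device.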

Defragmenting like this while the file system is on line is tricky, but
can be done with help from the file system in making some changes atomic.

What gets tricky, to my mind, is removing blocks from the *middle* of
a virtual device. Say that you have three disks that are concatenated
to make a file system, of 2, 1 and 2 GB, respectively. You want to upgrade,
so you do a little bit of housecleaning to get the file system down to < 4 GB,
remove the 1 GB disk, and install a nice new 8 GB disk in its place.

We now have a virtual memory fragmentation problem. The file system
has (let us assume) a 32-bit address space of block numbers. These
are virtual addresses, which get mapped to physical addresses on
various devices. Each disk added and removed can be seen as an
allocation and free of virtual address space. We need to keep track of
this and the mappings to devices somehow.
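
To make the fragmentation concrete, here is a small sketch that tracks the
virtual address space as a list of extents, using the 2, 1 and 2 GB disks
from the example. ExtentMap and its methods are hypothetical, the unit is
whole gigabytes for readability, and the 16 GB limit on the address space
is an arbitrary assumption.

```python
class ExtentMap:
    """Toy map from virtual address ranges to devices."""

    def __init__(self):
        self.extents = []   # (start, length, device), kept sorted by start

    def add(self, start, length, device):
        """Allocate [start, start+length) to `device`; reject overlaps."""
        for s, n, _ in self.extents:
            if start < s + n and s < start + length:
                raise ValueError("overlaps existing extent")
        self.extents.append((start, length, device))
        self.extents.sort()

    def remove(self, device):
        """Free every extent belonging to `device`."""
        self.extents = [e for e in self.extents if e[2] != device]

    def holes(self, limit):
        """Return unmapped (start, length) ranges below `limit`."""
        out, pos = [], 0
        for s, n, _ in self.extents:
            if s > pos:
                out.append((pos, s - pos))
            pos = s + n
        if pos < limit:
            out.append((pos, limit - pos))
        return out


m = ExtentMap()
m.add(0, 2, "A")
m.add(2, 1, "B")
m.add(3, 2, "C")
m.remove("B")            # pull the 1 GB disk out of the middle
print(m.holes(16))       # [(2, 1), (5, 11)]
```

The new 8 GB disk does not fit in the 1 GB hole at address 2. Either it is
mapped contiguously at address 5 and the hole is left unused, or it is
split into two extents (1 GB at 2 and 7 GB at 5), in which case one device
no longer corresponds to one contiguous range of virtual addresses.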

There is no particular reason why a device has to map to contiguous
blocks in the virtual address space, but to do otherwise will mess up
the file system's block allocation strategies unless they are extended
to understand the logical-to-physical mapping.

Of course, changing a virtual address for some data once it has been
allocated is *very* expensive and, in fact, fraught with race conditions,
so you don't want to have to do that.

I'm just wondering, how do existing implementations deal with this?

-- 
	-Colin

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu