Re: (reiserfs) Re: LVM / Filesystems / High availability

Florian Lohoff (flo@quit.mediaways.net)
Thu, 25 Jun 1998 11:54:51 +0200


On Thu, Jun 25, 1998 at 01:19:23AM +0200, Erik Corry wrote:
> In article <199806231954.PAA24101@dcl.MIT.EDU> you wrote:
> > Date: Tue, 23 Jun 1998 18:40:06 +0200
> > From: Florian Lohoff <flo@quit.mediaways.net>
>
> > The LVM approach with the "virtual block device" makes many things
> > much easier. You can keep filesystem code very simple, and the LVM
> > code also isn't very complex. The only thing you need to take care of is
> > the Block Allocation of the LVM which you might do as complex and
> > intelligent as you like but a bug in there will NOT cause data to get
> > lost or corrupt.
>
> > I disagree; the block allocation issue gets very complex if you try and
> > treat the LVM as a virtual block device with "holes" in the device. It's
> > much, much simpler if the filesystem is intimately aware of each logical
> > volume, and knows what size it is.
>
> I think you are misunderstanding the way the LVM works. If
> it is anything like the AIX LVM, then you split all disks
> up into 4MB chunks and allocate them to the filesystems
> any way you please. That's a lot of little chunks, do
> you really want to handle them in the filesystem?
>
> When you shrink a filesystem under this system you
> always shrink at the end. If the 4MB chunks you want
> to free weren't at the end, then you move them off the
> disks they are on, and onto a freed up 4MB chunk. This
> requires some cleverness in the LVM to do online, but it
> doesn't seem impossible (redirect writes to new device,
> keep a bitmap of which blocks are written on new device,
> move old blocks over, don't overwrite new blocks, during
> the move, reads consults the bitmap and reads from the
> right side).

As I understand Heinz, this is already done and working.
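
To make the quoted online-move scheme concrete, here is a minimal
sketch of it. It is not the actual LVM code: the class and method
names are made up for illustration, and the two "devices" are plain
Python lists standing in for one extent on each disk.

```python
class OnlineMove:
    """Move one extent from an old device to a new one while I/O continues.

    Hypothetical illustration only; old and new are lists of blocks.
    """

    def __init__(self, old, new):
        assert len(old) == len(new)
        self.old = old                       # source extent
        self.new = new                       # destination extent
        self.on_new = [False] * len(old)     # bitmap: block is valid on new

    def write(self, blk, data):
        # Redirect every write to the new device and note it in the bitmap.
        self.new[blk] = data
        self.on_new[blk] = True

    def read(self, blk):
        # Reads consult the bitmap and go to whichever side is current.
        return self.new[blk] if self.on_new[blk] else self.old[blk]

    def copy_step(self, blk):
        # Background copy: never overwrite a block already written on new.
        if not self.on_new[blk]:
            self.new[blk] = self.old[blk]
            self.on_new[blk] = True

    def done(self):
        return all(self.on_new)


old = ["a", "b", "c", "d"]
new = [None] * 4
mv = OnlineMove(old, new)
mv.copy_step(0)          # background copy has moved block 0
mv.write(2, "C2")        # a write arrives while the move is running
for blk in range(4):     # background copy finishes the rest
    mv.copy_step(blk)
```

After the loop, block 2 on the new device still holds the data written
during the move, because the copy skips blocks the bitmap marks as
already present there.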

> This means the fs doesn't need to know where the underlying
> block devices stop and start.
>
> Advantages:
> 1) One level of block virtualisation, not two (mirror/stripe in
> block device, concatenation in fs)
> 2) Works for other filesystems (though many of them are for
> compatibility with other OSs anyway, and since the other OS
> doesn't understand the LVM it won't work).

Think of having not only ext2 for your system disk but also
a log-structured filesystem for the news base. Corruption
of the news base isn't that bad, so mirror your system disk
and stripe your news base for performance, all in one
volume group on x disks.

> 3) Nice small chunks so you can divide up between partitions
> in a very flexible way. This lets us have /var and /home
> on different partitions, but still allows us to change their
> relative sizes without repartitioning.
> 4) You don't need to complicate the mount interface to specify
> several partitions. It's still just one block device.
> 5) Filesystem still works with simple fast contiguous block
> numbers internally.
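
The single level of block virtualisation in the list above comes down
to one table lookup per request. A rough sketch, assuming 4MB extents
of 1KB blocks; the table layout, the function name and the disk names
are invented for the example and are not taken from any LVM source:

```python
EXTENT_BLOCKS = 4096  # assumption: 4MB extents made of 1KB blocks


def map_block(extent_table, logical_block):
    """Translate a logical block number into (pv, physical block).

    extent_table[le] is the (pv, pe) pair backing logical extent le.
    """
    le, offset = divmod(logical_block, EXTENT_BLOCKS)
    pv, pe = extent_table[le]
    return pv, pe * EXTENT_BLOCKS + offset


# One logical volume whose three extents are scattered over two disks.
# The filesystem only ever sees blocks 0 .. 3 * EXTENT_BLOCKS - 1.
table = [("sda", 7), ("sdb", 0), ("sda", 3)]

first = map_block(table, 0)
second = map_block(table, EXTENT_BLOCKS)
third = map_block(table, 2 * EXTENT_BLOCKS + 5)
```

Growing the volume appends an entry to the table and moving an extent
rewrites one entry; the filesystem's contiguous block numbers do not
change in either case.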

Ted mentioned a case where the LVM has a problem taking
PVs out of service: there are no free PEs, but there is
free space in the filesystem.
My approach was to shrink the filesystem by the size of the
failing PV and then replace the PEs on that PV with PEs on
other PVs, but this moves blocks onto the failing disk,
which is the very disk we want to take out of service.

Difficult ... my solution was to attach a new drive and copy directly
from the failing drive to the new one ...

That is only one difficulty instead of a few hundred in ext2 :)))
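
A sketch of the "attach a new drive and copy" idea, again with
invented names (evacuate, the table layout, disk0/disk1/disk2) and not
the real LVM interface: every PE on the failing PV is remapped to a
free PE on some other PV, and the function returns the list of copies
to perform. If there are not enough free PEs it refuses and changes
nothing, which is exactly the case Ted described.

```python
def evacuate(lv_tables, free_pes, failing_pv):
    """Remap every extent on failing_pv to a free extent elsewhere.

    lv_tables: {lv_name: [(pv, pe), ...]}, indexed by logical extent
    free_pes:  {pv: [free pe numbers]}
    Returns a list of ((src_pv, src_pe), (dst_pv, dst_pe)) copies.
    """
    # Free extents on every PV except the one being retired.
    targets = [(pv, pe)
               for pv, pes in free_pes.items() if pv != failing_pv
               for pe in pes]
    # Every logical extent that currently lives on the failing PV.
    victims = [(name, le)
               for name, table in lv_tables.items()
               for le, (pv, _) in enumerate(table) if pv == failing_pv]
    if len(victims) > len(targets):
        raise RuntimeError("not enough free PEs outside " + failing_pv)

    moves = []
    for (name, le), dst in zip(victims, targets):
        moves.append((lv_tables[name][le], dst))
        lv_tables[name][le] = dst   # switch the mapping to the new extent
    return moves


lvs = {"news": [("disk0", 0), ("disk1", 0), ("disk0", 1)]}
moves = evacuate(lvs, {"disk2": [0, 1, 2]}, "disk0")
```

In a real system the data copy would have to finish (or be tracked
block by block) before the mapping is switched; the sketch only plans
the moves.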

Flo

-- 
Florian.Lohoff@mediaWays.net			+49-5241-80-7085
aka flo@mini.gt.owl.de			@HOME	+49-5241-470566
