Re: [dm-devel] can't recover ext4 on lvm from ext4_mb_generate_buddy:739:group 1687, 32254 clusters in bitmap, 32258 in gd

From: Mikulas Patocka
Date: Fri Jan 06 2012 - 11:40:25 EST




On Thu, 5 Jan 2012, Ted Ts'o wrote:

> On Thu, Jan 05, 2012 at 05:14:28PM +0100, Sander Eikelenboom wrote:
> >
> > OK spoke too soon, i have been able to trigger it again:
> > - copying files from LV to the same LV without the snapshot went OK
> > - copying from the RO snapshot of a LV to the same LV gave the error while copying the file again:
>
> OK. Originally, you said you did this:
>
> 1) fsck -v -p -f the filesystem
> 2) mount the filesystem
> 3) Try to copy a file
> 4) filesystem will be mounted RO on error (see below)
> 5) fsck again, journal will be recovered, no other errors
> 6) start at 1)
>
> Was this with a read-only snapshot always being in existence
> through all of these five steps? When was the RO snapshot created?
>
> If a RO snapshot has to be there in order for this to happen, then
> this is almost certainly a device-mapper regression. (dm-devel folks,

The existence of a snapshot changes I/O completion times significantly, so
it may be a race condition in ext4 that gets triggered by the changed
timings.

Mikulas

> this is a problem which apparently occurred when the user went from
> v3.1.5 to v3.2, so this looks like a 3.2 regression.)
>
> - Ted
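
For anyone trying to reproduce this, the fsck/mount/copy cycle quoted above
can be sketched as a small script. This is only an illustration under
stated assumptions: the device and path names are hypothetical, the source
file is assumed to live on an already-mounted read-only snapshot of the
same LV, and the umount between the copy and the second fsck is an
assumption that is not part of the quoted steps. It defaults to a dry run
that only prints the commands.

```python
import subprocess


def repro_commands(lv, mountpoint, src, dst):
    """Build the command sequence for one pass of the quoted steps.

    Step 4 in the quoted list (the filesystem being remounted read-only on
    error) is an observed outcome, not a command, so it is not listed here.
    """
    fsck = ["fsck", "-v", "-p", "-f", lv]  # flags taken from the quoted report
    return [
        fsck,                        # 1) fsck the filesystem
        ["mount", lv, mountpoint],   # 2) mount the filesystem
        ["cp", src, dst],            # 3) try to copy a file
        ["umount", mountpoint],      # assumption: unmount before checking again
        list(fsck),                  # 5) fsck again
    ]


def run_once(cmds, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in cmds:
        print(" ".join(argv))
        if not dry_run:
            # fsck exits non-zero when it fixes something, so do not raise.
            subprocess.run(argv, check=False)


if __name__ == "__main__":
    # Hypothetical names; replace with the real LV and snapshot paths.
    run_once(repro_commands("/dev/vg0/data", "/mnt/data",
                            "/mnt/snap/file", "/mnt/data/file.copy"))
```

Running it with dry_run=False needs root and will modify the filesystem, so
it is best pointed at a scratch LV.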