Re: [Bug 100491] New: Oops under bitmap_start_sync [md_mod] at boot

From: Austin S Hemmelgarn
Date: Mon Jun 29 2015 - 08:28:41 EST


On 2015-06-28 16:53, Sami Liedes wrote:
On Thu, Jun 25, 2015 at 09:02:45PM +0000, bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
https://bugzilla.kernel.org/show_bug.cgi?id=100491

Bug ID: 100491
Summary: Oops under bitmap_start_sync [md_mod] at boot
[...]
Reading all physical volumes. This may take a while...
Found volume group "rootvg" using metadata type lvm2
device-mapper: raid: Device 0 specified for rebuild: Clearing superblock
md/raid1:mdX: active with 1 out of 2 mirrors
mdX: invalid bitmap file superblock: bad magic
md-cluster module not found.
mdX: Could not setup cluster service (256)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000100
IP: [<ffffffff8159e4a9>] _raw_spin_lock_irq+0x29/0x70
PGD 0
Oops: 0002 [#1] PREEMPT SMP
[...]

I'm marking this as a regression in bugzilla, since this seems to
prevent booting on 4.1.0 at least in certain circumstances (namely
those which I have; I wonder if any raid1 recovery works?) while 4.0.6
boots correctly.
I can confirm having the same issue with the MD code being used through dm-raid.

I bisected this down to one of four commits. Well, assuming that the
problem was caused by changes in drivers/md; a fair assumption, I
think. The commits are:

$ git bisect view --oneline
f9209a3 bitmap_create returns bitmap pointer
96ae923 Gather on-going resync information of other nodes
54519c5 Lock bitmap while joining the cluster
b97e9257 Use separate bitmaps for each nodes in the cluster
My own bisect turned up the same set of commits, although I won't have time to go any further with it until next weekend.

The crash happens whether or not CONFIG_MD_CLUSTER is enabled.
Again, same here.

