Re: PROBLEM: Kernel BUG with raid5 soft + Xen + DRBD - invalid opcode

From: MasterPrenium
Date: Fri Dec 30 2016 - 19:00:53 EST


Thanks for your reply. Isn't DRBD part of the kernel? I thought it had been included since 2.6.3x.

I've just tested without DRBD, and the issue seems to remain. I can't see the "BUG" message this time, but the kernel still crashed (a little bit later).
I don't have a full dump, since I lost both my network connection and my serial connection.
Here is a picture of what I got:
Another one:

It also seems to me that having the "glances" monitoring software running in dom0 makes the kernel crash sooner. I don't think this will help, but just in case...

Any ideas, or any tests I can run? This is really a blocking issue, with potential data loss...

Best regards,

On 30/12/2016 21:54, Jes Sorensen wrote:
MasterPrenium <masterprenium.lkml@xxxxxxxxx> writes:
Hello Guys,

I'm having some trouble on a new system I'm setting up. I'm getting a
kernel BUG message, which seems to be related to the use of Xen (when I
boot the system _without_ Xen, I don't get any crash).
Here is the configuration:
- 3x hard drives in a software RAID 5 array created with mdadm
- On top of it, DRBD for replication to another node (active/passive cluster)
- On top of it, a Btrfs filesystem with a few subvolumes
- On top of it, Xen VMs running.

The BUG happens when I generate "heavy" I/O (20 MB/s with an rsync,
for example) on the RAID5 stack.
I have to reset the system to make it work again.

Reproducible: ALWAYS (once the I/O starts, it crashes within 2-5 minutes). Also
reproducible on another system with the same hardware.
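
For anyone trying to reproduce this without rsync, a sustained write load at
a similar rate can be approximated with a small script. This is only a sketch
and not the workload from the report: it assumes a plain sequential write with
fsync is enough to exercise the stack, and the function name, chunk size and
target path are all hypothetical.

```python
import os
import time


def write_throttled(path, rate_bytes_per_s, duration_s, chunk_size=1024 * 1024):
    """Write data to `path` at roughly `rate_bytes_per_s` for `duration_s` seconds.

    Returns the total number of bytes written. This is an assumed stand-in
    for the rsync load in the report, not a copy of it.
    """
    chunk = os.urandom(chunk_size)  # incompressible payload, reused for every write
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while time.monotonic() - start < duration_s:
            f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # push the data down to the block layer
            written += chunk_size
            # Time by which we are ahead of the target average rate.
            ahead = written / rate_bytes_per_s - (time.monotonic() - start)
            if ahead > 0:
                time.sleep(ahead)
    return written
```

For example, write_throttled("/mnt/test/io_load.bin", 20 * 1024 * 1024, 300)
would write at about 20 MB/s for five minutes, producing a file of roughly
6 GB. The path is a placeholder for a file on the filesystem that sits on the
RAID5 stack.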

Kernel versions impacted (at least): kernel-4.4.26, kernel-4.8.15, kernel-4.9.0

Well, you have one foreign object in there that is not part of the
kernel and which shows up in the OOPS: DRBD

What happens when you remove that from the equation?