Re: 2.6.{26.2,27-rc} oops on virtualbox

From: Gerhard Brauer
Date: Sun Aug 31 2008 - 05:29:36 EST


On Thu, Aug 28, 2008 at 10:30:13AM -0300, Luiz Fernando N. Capitulino wrote:
> On Wed, 27 Aug 2008 19:33:28 -0400
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
> |
> | Since this problem appears while we are using a simple memcpy (the
> | text_poke_early version), but disappears when we disable interrupts for
> | a longer period of this, I suspect a problem with irq disabling in
> | Virtualbox.
> |
> | We could try to add some nsleep() or msleep() calls within text_poke and
> | text_poke_early before and after the code modification to see if the
> | problem disappears. If it does, then that would somewhat confirm the
> | racy irq disable thesis.
>
> Well, an Ubuntu kernel guy has reported in the virtualbox's ticket[1]
> that the oops doesn't happen if he puts a printk() in the crash site.
>
> The funny thing is that someone (who might be a virtualbox developer)
> used the same race argument to say that this is a bug in the kernel.
>
> What concerns me, though, is how virtualbox can be worth using
> in the Linux community if it's probably not working on various distros
> (currently Fedora, Ubuntu, Mandriva and ArchLinux).
>
> Thanks for the effort, guys.
>
> [1] http://www.virtualbox.org/ticket/1875

OK, some news from the Arch Linux side:
Our distribution kernel was upgraded from 2.6.26.2 to 2.6.26.3. With
this upgrade to patchlevel .3 the "early oops" (freeing smp...) is gone.
My virtual machines now always boot fine with it, and I have one
confirmation of this from a user.

The kernel upgrade does not solve the kernel panic that occurs while
working in the VM under heavy disk I/O. I tested and could reproduce this
by untarring two big files into separate dirs:
bsdtar -x -f VirtualBox-1.6.2-OSE.tar.bz2.
Doing this simultaneously always crashed the VM.
Screenshot:
http://users.archlinux.de/~gerbra/tmp/2008-08-31-110449_724x456_scrot.png
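For anyone who wants to repeat the test, here is a rough sketch of the
same idea: unpack several archives at the same time, each into its own
directory. It is only an illustration, not the exact commands I ran. It
uses Python's tarfile module instead of bsdtar, and the names extract(),
stress() and the run0/run1 directory layout are made up for this example.

```python
import sys
import tarfile
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def extract(archive, dest):
    """Unpack one archive into its own directory; return the member count."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:  # compression type is auto-detected
        members = tar.getmembers()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions want an explicit extraction filter.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return len(members)


def stress(archives, work_dir):
    """Extract all archives concurrently, one thread and one directory each."""
    jobs = [(Path(a), Path(work_dir) / f"run{i}") for i, a in enumerate(archives)]
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        futures = [pool.submit(extract, a, d) for a, d in jobs]
        # result() re-raises any exception from the worker thread.
        return [f.result() for f in futures]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Each archive named on the command line is unpacked in parallel.
    target = tempfile.mkdtemp(prefix="io-stress-")
    print(target, stress(sys.argv[1:], target))
```

Passing the same big tarball twice gives two parallel extractions into
separate directories, which is roughly the load that crashed the VM here.
Whether this generates exactly the same I/O pattern as two bsdtar
processes is an assumption.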

This heavy-I/O oops does not occur under 2.6.26.2 when using the
"3-changes-patch" against alternatives.c, which we tested in the
other mails. There must be something IRQ-related that the
3-changes-patch fixes and that was not fixed in 2.6.26.3.
On the other hand: I had never stressed a VM like this before
researching this problem. So it could also be that the heavy-I/O
problem is a totally separate problem from the one we're talking about
here. Doing my "normal" work in the VM now (it's my devel VM for
compiling and testing), I have not hit this I/O oops so far.

We use a mostly unpatched kernel as our distribution kernel.

So, a short summary from my side:
a) With the "3-changes-patch" I get a rock-solid VM.
b) 2.6.26.2 has the early oops on boot, and the I/O oops on the
occasions when it does boot.
c) 2.6.26.3 has only the heavy-I/O oops.

I'll try a fresh VM, where I will test:
a) Using SATA controller emulation as the bus (currently I have IDE (PIIX3))
b) Using different filesystems (with 2.6.26.2 the early oops and heavy-I/O
oops could be reproduced with any filesystem).


Regards
Gerhard
