Re: kvm causing memory corruption? now 2.6.26-rc4

From: Avi Kivity
Date: Wed Jun 04 2008 - 09:43:00 EST


Dave Hansen wrote:
On Thu, 2008-03-27 at 16:59 +0200, Avi Kivity wrote:
Dave Hansen wrote:
On Thu, 2008-03-27 at 12:10 +0200, Avi Kivity wrote:
btw, is this with >= 4GB RAM on the host?
Well, are you asking whether I have PAE on or not? :)
No, I'm asking whether there is a possibility of address truncation :)

PAE by itself doesn't affect kvm much, as it always runs the guest in PAE mode.

Can you try running with mem=2000M or something?
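
For illustration, the address truncation in question is what happens when a physical address at or above 4 GB passes through a 32-bit variable. The following is a minimal userspace sketch, not kvm code; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store a 64-bit physical address in a 32-bit
 * variable, as buggy code on a 32-bit PAE host might do. */
uint32_t truncate_paddr(uint64_t paddr)
{
    return (uint32_t)paddr;   /* bits 32 and above are silently dropped */
}

/* Returns 1 if the address does not survive the round trip. */
int paddr_is_truncated(uint64_t paddr)
{
    return (uint64_t)truncate_paddr(paddr) != paddr;
}

/* Sanity checks: an address at 4 GB + 4 KB aliases the page at 4 KB,
 * while an address below 4 GB is unaffected. */
void paddr_self_check(void)
{
    assert(truncate_paddr(0x100001000ULL) == 0x1000u);
    assert(paddr_is_truncated(0x100001000ULL));
    assert(!paddr_is_truncated(0x7d000000ULL));   /* 2000 MB */
}
```

With the host limited to 2000 MB, no physical address reaches 4 GB, so a bug of this kind could not trigger. That is the point of the mem= test.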

I have a few more data points on this. Sorry for the massive delay from
the last report -- I'm being a crappy bug reporter. But, this is on my
one and only laptop, which makes it a serious pain to diagnose. I also
didn't have a hardware serial console on it before, which I do now.
This is all on 2.6.26-rc4-01549-g1beee8d.

Adding the mem= does not help at all. But, it is all a bit more
diagnosable now than a month or two ago. I turned on all of the kernel
debugging that I could get my grubby little hands on. It now oopses
quite consistently when kvm runs instead of after. Here's a collection
of oopses that I captured after setting up a serial line:

http://sr71.net/~dave/kvm-oops1.txt

After collecting all those, I turned on CONFIG_DEBUG_HIGHMEM and the
oopses miraculously stopped. But, the guest hung (for at least 5
minutes or so) during Windows bootup, pegging my host CPU. Most of the
CPU was going to klogd, so I checked dmesg.


Can you check with mem=900M (and CONFIG_DEBUG_HIGHMEM=n)? That will confirm that the problems are highmem related, but not physical address truncation related.
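
To show why a limit of about 900 MB isolates highmem, here is a rough sketch of how much RAM ends up as highmem on a 32-bit host. It assumes the common default 3G/1G split, where roughly the first 896 MB is directly mapped lowmem; the exact boundary depends on the kernel configuration, and the names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/* Assumption: default 32-bit x86 split, about 896 MB of lowmem. */
#define ASSUMED_LOWMEM_LIMIT (896ULL * MB)

/* Bytes of RAM above the assumed lowmem limit, i.e. highmem. */
uint64_t highmem_bytes(uint64_t total_ram)
{
    if (total_ram <= ASSUMED_LOWMEM_LIMIT)
        return 0;
    return total_ram - ASSUMED_LOWMEM_LIMIT;
}

/* Sanity checks for the two limits discussed in this thread. */
void highmem_self_check(void)
{
    assert(highmem_bytes(900ULL * MB) == 4ULL * MB);
    assert(highmem_bytes(2000ULL * MB) == 1104ULL * MB);
}
```

Under that assumption, a 900 MB limit leaves only about 4 MB of highmem, while a 2000 MB limit still leaves over 1 GB of it. If the oopses persist at 2000 MB but disappear at 900 MB, highmem handling is the likely culprit.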

I was seeing messages like this:

[ 428.918108] kvm_handle_exit: unexpected, valid vectoring info and exit reason is 0x9

And quite a few of them, like 100,000/sec. That's why klogd was pegging
the CPU. Any idea on a next debugging step?


Exit reason 0x9 is a task switch. Newer versions of kvm handle them.
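
As a rough sketch of how the two values in that message decode: the numbers below are the VMX basic exit reasons and the IDT-vectoring information valid bit from the Intel manuals, but the function names are hypothetical and the table lists only a few reasons. This is not the kvm source.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 31 of the IDT-vectoring information field: set when the VM exit
 * interrupted the delivery of an event. */
#define VECTORING_INFO_VALID 0x80000000u

/* A few VMX basic exit reasons; the full list is much longer. */
#define EXIT_REASON_EXCEPTION_NMI 0u
#define EXIT_REASON_TASK_SWITCH   9u
#define EXIT_REASON_CPUID         10u

int vectoring_info_is_valid(uint32_t vectoring_info)
{
    return (vectoring_info & VECTORING_INFO_VALID) != 0;
}

/* Hypothetical helper: name an exit reason for a log message. */
const char *exit_reason_name(uint32_t exit_reason)
{
    switch (exit_reason) {
    case EXIT_REASON_EXCEPTION_NMI: return "exception or NMI";
    case EXIT_REASON_TASK_SWITCH:   return "task switch";
    case EXIT_REASON_CPUID:         return "CPUID";
    default:                        return "other";
    }
}

/* Sanity check: 0x9, the value from the log line, is a task switch. */
void exit_reason_self_check(void)
{
    assert(exit_reason_name(0x9)[0] == 't');
    assert(vectoring_info_is_valid(0x80000000u));
}
```

Calling exit_reason_name(0x9) returns "task switch", which matches the exit reason in the log line above.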


--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
