Re: [Qemu-devel] x86, nops settings result in kernel crash

From: Tomas Racek
Date: Fri Aug 17 2012 - 03:44:12 EST


----- Original Message -----
> Alan Cox <alan@xxxxxxxxxxxxxxxxxxx> writes:
>
> > On Thu, 16 Aug 2012 14:45:15 -0400 (EDT)
> > Tomas Racek <tracek@xxxxxxxxxx> wrote:
> >
> >> ----- Original Message -----
> >> > On Thu, Aug 16, 2012 at 09:35:12AM -0400, Tomas Racek wrote:
> >> > > Hi,
> >> > >
> >> > > I am writing a file system test which I execute in qemu with
> >> > > kernel compiled from latest git sources and running it causes
> >> > > this error:
> >> > >
> >> > > https://bugzilla.kernel.org/show_bug.cgi?id=45971
> >> > >
> >> > > It works with v3.5, so I ran git bisect which pointed me to:
> >> > >
> >> > > d6250a3f12edb3a86db9598ffeca3de8b4a219e9
> >> > > x86, nops: Missing break resulting in incorrect selection on Intel
> >> > >
> >> > > To be quite honest, I don't understand this stuff much but I
> >> > > tried to do some debugging and I figured out (I hope) that the
> >> > > crash is caused by setting ideal_nops to p6_nops (k8_nops was
> >> > > used before the break statement was added).
> >> >
> >> > Maybe I overlooked it or maybe it was implied but did you try
> >> > reverting the patch and rerunning your test? Does it work ok then?
> >> >
> >>
> >> Yes, if I remove the break statement (introduced by this commit),
> >> it works fine.
> >
> > What version of qemu is this? I wonder whether we have a qemu bug here.
>
> From the cpuinfo, it's 0.15.1. That's old but not ancient.

I've just upgraded my distribution, so I also tried qemu 1.0.1. It behaves the same as the previous version.

>
> I took a brief look at the kernel code here. The default invocation
> of qemu presents an idealistic CPU with a very minimal set of feature
> bits exposed. No processor has ever existed with this feature set.
>
> We do this in order to maintain compatibility when migrating from
> Intel to AMD but also for legacy reasons.
>
> From the report, using '-cpu host' solves the problem. '-cpu host'
> exposes most of the host CPUID to the guest.

Well, I've added some debug statements to the code:

void __init arch_init_ideal_nops(void)
{
	switch (boot_cpu_data.x86_vendor) {
	case X86_VENDOR_INTEL:
		/*
		 * Due to a decoder implementation quirk, some
		 * specific Intel CPUs actually perform better with
		 * the "k8_nops" than with the SDM-recommended NOPs.
		 */
		if (boot_cpu_data.x86 == 6 &&
		    boot_cpu_data.x86_model >= 0x0f &&
		    boot_cpu_data.x86_model != 0x1c &&
		    boot_cpu_data.x86_model != 0x26 &&
		    boot_cpu_data.x86_model != 0x27 &&
		    boot_cpu_data.x86_model < 0x30) {
			printk("NOPS: Option 1\n");
			ideal_nops = k8_nops;
		} else if (boot_cpu_has(X86_FEATURE_NOPL)) {
			printk("NOPS: Option 2\n");
			ideal_nops = p6_nops;
		} else {
			printk("NOPS: Option 3\n");
#ifdef CONFIG_X86_64
			ideal_nops = k8_nops;
#else
			ideal_nops = intel_nops;
#endif
		}
		break;
	default:
#ifdef CONFIG_X86_64
		ideal_nops = k8_nops;
#else
		if (boot_cpu_has(X86_FEATURE_K8))
			ideal_nops = k8_nops;
		else if (boot_cpu_has(X86_FEATURE_K7))
			ideal_nops = k7_nops;
		else
			ideal_nops = intel_nops;
#endif
	}
}

This gives me Option 1 with "-cpu host" and Option 2 without.
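
To make the branch selection easier to follow outside the kernel, here is a minimal user-space sketch of the same decision. It is only an illustration: nops_option() and its parameters are hypothetical names standing in for the boot_cpu_data fields and boot_cpu_has(X86_FEATURE_NOPL), and the return value 0 for the non-Intel "default" case is my own convention, not something the kernel prints.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the Intel branch in
 * arch_init_ideal_nops(). The parameters stand in for:
 *   intel    -> boot_cpu_data.x86_vendor == X86_VENDOR_INTEL
 *   family   -> boot_cpu_data.x86
 *   model    -> boot_cpu_data.x86_model
 *   has_nopl -> boot_cpu_has(X86_FEATURE_NOPL)
 *
 * Returns the "NOPS: Option N" number from the debug printk calls,
 * or 0 when the vendor is not Intel (the "default" case).
 */
int nops_option(bool intel, int family, int model, bool has_nopl)
{
	/* Family and model are decoded from CPUID and are never negative. */
	assert(family >= 0 && model >= 0);

	if (!intel)
		return 0;

	/* Family 6 models that get k8_nops because of the decoder quirk. */
	if (family == 6 &&
	    model >= 0x0f &&
	    model != 0x1c &&
	    model != 0x26 &&
	    model != 0x27 &&
	    model < 0x30)
		return 1;	/* ideal_nops = k8_nops */

	if (has_nopl)
		return 2;	/* ideal_nops = p6_nops */

	return 3;		/* k8_nops on 64-bit, intel_nops on 32-bit */
}
```

With illustrative inputs (not values taken from my cpuinfo), nops_option(true, 6, 0x1a, true) lands in Option 1, while an Intel family 6 CPU reporting a model below 0x0f together with NOPL, e.g. nops_option(true, 6, 2, true), lands in Option 2. So the model number the guest sees is enough to flip the choice between k8_nops and p6_nops.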

> That said, QEMU really doesn't do anything differently depending on
> what feature bits are exposed to the guest. So my guess is that the
> odd combination of CPUID bits that are exposed to the guest is
> confusing the kernel.
>
> Can you post dmesg from the host kernel? Perhaps there's instruction
> emulation failing in the host KVM? That would manifest in strange
> behavior in the guest.

dmesg is in the attachment (qemu ran without "-cpu" argument). If I add "-cpu host" I get this:

[ 1046.112320] kvm: 5938: cpu0 unhandled rdmsr: 0x345
[ 1046.114998] kvm: 5938: cpu0 unhandled wrmsr: 0x680 data 0
[ 1046.115000] kvm: 5938: cpu0 unhandled wrmsr: 0x6c0 data 0
[ 1046.115002] kvm: 5938: cpu0 unhandled wrmsr: 0x681 data 0
[ 1046.115004] kvm: 5938: cpu0 unhandled wrmsr: 0x6c1 data 0
[ 1046.115005] kvm: 5938: cpu0 unhandled wrmsr: 0x682 data 0
[ 1046.115007] kvm: 5938: cpu0 unhandled wrmsr: 0x6c2 data 0
[ 1046.115009] kvm: 5938: cpu0 unhandled wrmsr: 0x683 data 0
[ 1046.115010] kvm: 5938: cpu0 unhandled wrmsr: 0x6c3 data 0
[ 1046.115012] kvm: 5938: cpu0 unhandled wrmsr: 0x684 data 0
[ 1046.115013] kvm: 5938: cpu0 unhandled wrmsr: 0x6c4 data 0


Regards,

Tomas

>
> Regards,
>
> Anthony Liguori
>
> >
> > Alan
>

Attachment: dmesg
Description: Binary data