RE: Early kernel panic in dmi_decode when running 32-bit kernel on Hyper-V on Windows 11

From: Michael Kelley
Date: Tue Apr 16 2024 - 19:20:47 EST


From: Michael Schierl <schierlm@xxxxxx> Sent: Tuesday, April 16, 2024 2:24 PM
>
> On 16.04.2024 at 01:31, Michael Kelley wrote:
>
> > Can you give me details of the Hyper-V VM configuration? Maybe
> > a screenshot of the Hyper-V Manager "Settings" for the VM would
> > be a good starting point, though some of the details are on
> > sub-panels in the UI.
>
> It used to be possible to export Hyper-V VM settings as XML, but
> apparently that option has been removed in Win2016/Win10, in favor of
> their own proprietary binary .vmcx format...
>
> Also, maybe it matters what else Hyper-V is doing. I've installed both
> WSL and WSA, and Windows Defender is using Core Isolation Memory
> Integrity. I have also enabled support for nested virtualisation in the
> Host/Network Switch, but not in that VM.
>
> Anyway, I just created two new VMs (one of each generation) with no hard
> disk and everything else default, added a DVD drive to the SCSI
> controller of Gen2 (which Gen1 already had on its IDE controller),
> disabled Secure Boot on Gen2 and added a second vCPU to Gen1 (which Gen2
> already had).
>
> Afterwards, Gen2's dmidecode looks like the summary you posted, and Gen1
> reproduces the issue.
>
> > I'm guessing your 32-bit Linux VM is
> > a Generation 1 VM. FWIW, my example was a Generation 2 VM.
>
> Very interesting that Gen2 boots 32-bit Linux better than Gen1 (there is
> a delay of about 30 seconds during hardware autoconfiguration
> (systemd-udevd) when booting Gen2, which I have not investigated yet),
> despite the documentation saying not to use Gen2 for any 32-bit guest OSes.
>
> So I assume this only applies to crappy OSes that directly couple their
> bitness to the bitness of the UEFI firmware.
>
> To be fair, the live media I'm using uses Grub's "non-compliant" Linux
> loader that bypasses the kernel's EFI stub. When trying with Grub's
> "linuxefi" loader, Linux does not boot either, as expected. (On the Gen1
> VM, the panic happens regardless of whether I use Grub's linux16 or linux
> loader, and also with the SYSLINUX/ISOLINUX loader.)
>
> > When you ran a 64-bit Linux and did not have the problem, was
> > that with exactly the same Hyper-V VM configuration, or a different
> > config?
>
> All my tests were performed with a single (Gen1) VM, and the only
> setting I changed was the number of vCPUs.
>

Thanks for the information. I now have a repro of "dmidecode"
in user space complaining about a zero length entry, when running
in a Gen 1 VM with a 64-bit Linux guest. Looking at
/sys/firmware/dmi/tables/DMI, that section of the DMI blob definitely
seems messed up. The handle is 0x0005, which is the next handle in
sequence, but the length and type of the entry are zero. This is a bit
different from the type 10 entry that you saw the 32-bit kernel
choking on, and I don't have an explanation for that. After this
bogus entry, there are a few bytes I don't recognize, then about
100 bytes of zeros, which also seems weird.
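
For anyone who wants to look at their own blob, here is a minimal sketch
of walking the structures in /sys/firmware/dmi/tables/DMI and flagging an
entry like the one described above. It assumes the standard SMBIOS
structure layout (a 4-byte header of type, length, and little-endian
handle, followed by a string set terminated by two NUL bytes). The names
walk_dmi and report are hypothetical, and this is neither dmidecode's nor
the kernel's actual parsing code.

```python
import struct

DMI_PATH = "/sys/firmware/dmi/tables/DMI"  # sysfs file mentioned above
HEADER_LEN = 4  # type (1 byte), length (1 byte), handle (2 bytes, LE)
END_OF_TABLE = 127  # SMBIOS end-of-table structure type


def walk_dmi(blob):
    """Yield (offset, type, length, handle, ok) for each structure header.

    Stops after the first entry whose length is smaller than the 4-byte
    header, because the next structure cannot be located past it.
    """
    offset = 0
    while offset + HEADER_LEN <= len(blob):
        etype, length, handle = struct.unpack_from("<BBH", blob, offset)
        if length < HEADER_LEN:
            yield offset, etype, length, handle, False
            return
        yield offset, etype, length, handle, True
        # The formatted area is followed by a string set that ends with
        # two consecutive NUL bytes.
        end = blob.find(b"\0\0", offset + length)
        if end < 0 or etype == END_OF_TABLE:
            return
        offset = end + 2


def report(path=DMI_PATH):
    """Print one line per structure; reading the file usually needs root."""
    with open(path, "rb") as f:
        blob = f.read()
    for offset, etype, length, handle, ok in walk_dmi(blob):
        status = "ok" if ok else "BAD (length < 4)"
        print(f"offset {offset:#06x}: type {etype:3d} "
              f"length {length:3d} handle {handle:#06x} {status}")
```

Calling report() as root prints one line per structure and stops at the
first entry whose length field is too small to be a valid header.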

But at this point, it's good that I have a repro. It has been a while since
I've built and run a 32-bit kernel, but I think I can get that set up with
the ability to get output during early boot. I'll do some further
debugging with dmidecode and with the 32-bit kernel to figure out
what's going on. There are several mysteries here: 1) Is Hyper-V
really building a bad DMI blob, or is something else trashing it?
2) Why does a 64-bit kernel succeed on the putative bad DMI blob,
while a 32-bit kernel fails? 3) Is dmidecode seeing something
different from the Linux kernel?

Give me a few days to sort all this out. And if Linux can be made
more robust in the face of a bad DMI table entry, I'll submit a
Linux kernel patch for that.

Michael Kelley