[regression?] 2.6.26 floppy boot failure with kernel packed using 'upx'

From: Frans Pop
Date: Thu Jul 10 2008 - 00:55:41 EST


For the Debian installer we've been tracing a problem with our installation
boot floppy. If booted with 'expert' at the syslinux prompt, it only shows:
<snip>
SYSLINUX 3.70 <etc.>
boot:expert
Loading linux......................
Loading initrd.gz....ready.
Probing EDD (edd=off to disable)... ok
</snip>

And then the emulator crashes (both VirtualBox and qemu), or on real
hardware the system reboots. The qemu crash is included at the bottom
of this mail.

The problem was initially seen with the 2.6.25 Debian kernel and traced
to a set of Xen-related patches backported from upstream 2.6.26. Floppies
using Debian 2.6.24 or pristine 2.6.24/2.6.25 don't show the problem.

Bisection has shown the culprit to be this very early 2.6.26 commit:
$ git bisect bad
099e1377269a47ed30a00ee131001988e5bcaa9c is first bad commit
commit 099e1377269a47ed30a00ee131001988e5bcaa9c
Author: Ian Campbell <ijc@xxxxxxxxxxxxxx>
Date: Wed Feb 13 20:54:58 2008 +0000

x86: use ELF format in compressed images.

An important factor here is that we "pack" the kernel using upx [1] (in
order to fit everything on a floppy). The original (unpacked) kernel after
this commit boots fine; only the packed version fails.
We have tried upx versions 2.01, 3.01 and 3.03, all with the same result.
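
Going by its title, the commit changes the compressed payload inside the
kernel image to ELF format. A minimal sketch for checking which kind of
payload an unpacked vmlinuz carries is below. It assumes the payload is a
gzip stream embedded somewhere in the image, and that a payload not starting
with the ELF magic is a flat binary; the function names are made up for
illustration and are not part of any existing tool.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip ID bytes followed by the deflate method byte
ELF_MAGIC = b"\x7fELF"        # first four bytes of any ELF file


def extract_gzip_payload(image):
    """Return the first gzip stream in `image` that decompresses completely.

    Assumption: the payload is stored as a plain gzip stream. Returns None
    if no candidate offset yields a complete stream.
    """
    pos = image.find(GZIP_MAGIC)
    while pos != -1:
        try:
            # wbits = 16 + MAX_WBITS makes zlib expect a gzip wrapper.
            decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
            payload = decomp.decompress(image[pos:])
            # eof is only set once the end of the gzip stream was reached,
            # so truncated or accidental matches are skipped.
            if decomp.eof and payload:
                return payload
        except zlib.error:
            pass  # the magic bytes matched by accident; keep searching
        pos = image.find(GZIP_MAGIC, pos + 1)
    return None


def payload_kind(image):
    """Classify the embedded payload as 'ELF' or 'flat binary'."""
    payload = extract_gzip_payload(image)
    if payload is None:
        return "no gzip payload found"
    return "ELF" if payload.startswith(ELF_MAGIC) else "flat binary"
```

It could be run as payload_kind(open("vmlinuz", "rb").read()) on the raw
kernels from before and after the commit. It is not expected to say anything
useful about vmlinuz.upx, since the packed file need not contain a gzip
stream at all.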

Both "good" (before commit) and "bad" (after commit) images are available
at: http://people.debian.org/~fjp/tmp/d-i/floppy/upx/
Included are the boot floppy image, the raw kernel and the packed kernel.

The issue can also be reproduced using qemu without booting the floppy
itself. For the "bad" image:
# Boots correctly (but fails when mounting root fs):
$ qemu -kernel vmlinuz -hda /dev/zero
# Fails:
$ qemu -kernel vmlinuz.upx -hda /dev/zero

So, the primary question here is:
- is this a kernel regression, because whatever changed no longer conforms
  to the "kernel format specs", or
- is this a latent issue in upx that somehow creates an invalid image, or
- does this change effectively create a new "type" of image that upx
  just doesn't yet know how to handle correctly?

And a follow-up question in the last two cases: how likely is it that this
change could/will cause similar issues in other comparable scenarios?

Note that we've been using this same compression technique for ages in
the Debian installer without any problems.

Cheers,
FJP

[1] http://upx.sourceforge.net/


upx command used for compression and output
-------------------------------------------
$ upx -f -9 vmlinuz
Ultimate Packer for eXecutables
Copyright (C) 1996,1997,1998,1999,2000,2001,2002,2003,2004,2005,2006,2007
UPX 3.01 Markus Oberhumer, Laszlo Molnar & John Reiser Jul 31st 2007

        File size         Ratio      Format       Name
   --------------------   ------   ------------   -----------
   1312304 ->   1245723   94.93%   bvmlinuz/386   vmlinuz

qemu crash output
-----------------
$ qemu -fda boot.img
qemu: fatal: triple fault
EAX=00000018 EBX=00000000 ECX=00000000 EDX=003646f6
ESI=000333a1 EDI=00000000 EBP=ffffffb0 ESP=0003c367
EIP=00101015 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 00000000 00000000 00000000
CS =0010 00000000 ffffffff 00cf9b00
SS =0018 00000000 ffffffff 00cf9300
DS =0000 00000000 00000000 00000000
FS =0018 00000000 ffffffff 00cf9300
GS =0018 00000000 ffffffff 00cf9300
LDT=0000 00000000 00000000 00008000
TR =0020 00001000 00000067 00008900
GDT= 656e6900 0000646e
IDT= 00000000 00000000
CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
CCS=ffeb7200 CCD=00000000 CCO=LOGICB
FCW=037f FSW=4000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
Aborted
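
decoding the GDT and CS lines of the crash output
-------------------------------------------------
The GDT line in the dump above stands out: both values consist of bytes in
the printable ASCII range. The sketch below rebuilds the 6 bytes an lgdt
instruction would have read from memory (16-bit limit followed by 32-bit
base, little-endian) and decodes the attribute word of a segment line. It
assumes qemu prints the GDT as base followed by limit, and that the last
column of a segment line is the high descriptor dword; the helper names are
made up for illustration.

```python
import struct


def gdtr_memory_image(base, limit):
    """Bytes of the 6-byte lgdt operand: 16-bit limit, then 32-bit base.

    Assumption: `base` and `limit` are the two values of the "GDT=" line,
    in that order.
    """
    return struct.pack("<HI", limit & 0xFFFF, base & 0xFFFFFFFF)


def decode_segment_flags(flags):
    """Decode the attribute bits of the high dword of a segment descriptor."""
    return {
        "type": (flags >> 8) & 0xF,                # 0xb = code, exec/read, accessed
        "code_or_data": bool(flags & (1 << 12)),   # S bit
        "dpl": (flags >> 13) & 0x3,                # descriptor privilege level
        "present": bool(flags & (1 << 15)),        # P bit
        "default_32bit": bool(flags & (1 << 22)),  # D/B bit
        "page_granular": bool(flags & (1 << 23)),  # G bit
    }


# Values copied from the dump above.
print(gdtr_memory_image(0x656E6900, 0x0000646E))  # b'nd\x00ine'
print(decode_segment_flags(0x00CF9B00))           # the CS line
```

Under those assumptions CS decodes as an ordinary flat 32-bit ring 0 code
segment, while the GDT register holds what looks like a fragment of string
data ("nd", a NUL byte, "ine"). That would fit a GDT pointer loaded from the
wrong place in memory, but it is only a hint, not a diagnosis.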