Re: 3.12 to 3.13 boot regression bisected - still applies to 3.16

From: Bruno Prémont
Date: Tue Aug 05 2014 - 05:13:41 EST


On Tue, 5 Aug 2014 09:45:42 +0100 Matt Fleming wrote:
> On Tue, 05 Aug, at 10:02:42AM, Bruno Prémont wrote:
> >
> > I tried in setup_arch(), but system still keeps rebooting.
> >
> > Working backwards I got to x86_64_start_kernel() in
> > arch/x86/kernel/head64.c but system is still rebooting.
>
> Thanks for doing this. I'm sure it was a major PITA ;-)

Fortunately, building the kernels on a separate system and being
able to build a dozen of them to try from the EFI shell makes it survivable.

> > Not sure what happens before x86_64_start_kernel() is called, it seems
> > to be called from ASM code in arch/x86/kernel/head_64.S.
>
> Yep. Roughly the code flow goes like this (chronologically),
>
> efi_pe_entry() [arch/x86/boot/compressed/head_64.S]
> efi_main() [arch/x86/boot/compressed/eboot.c]

I get at least to just before
    status = efi_call_early(exit_boot_services, handle, key);
in eboot.c on line 1310. An efi_printk() inserted there is displayed.

> startup_64 [arch/x86/kernel/head_64.S]
> secondary_startup_64 [arch/x86/kernel/head_64.S]
> x86_64_start_kernel() [arch/x86/kernel/head64.c]
>
> > > Meanwhile I'm going to go and stare at the EFI boot stub code and
> > > instrument OVMF to check for more memory corruption bugs like the one
> > > Michael found in commit c7fb93ec51d4 ("x86/efi: Include a .bss section
> > > within the PE/COFF headers").
> >
> > If there are places between exit_boot() in
> > arch/x86/boot/compressed/eboot.c and x86_64_start_kernel() where I
> > should include such loops, please tell!
>
> I guess we need to verify efi_main() actually exits correctly. So a
> while (1); loop at the end of that function would be useful.
>
> Assuming that does actually hang, you get the fun of rummaging around in
> the early assembly code, where you can use something like this,
>
> bruno:
>     hlt
>     jmp bruno
>
> to try and force a hang.

Will spin a few attempts and see what I get.
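
For reference, here is how the hang-loop results so far narrow things
down. This is only a minimal user-space sketch, not kernel code: the
enum, the array names and the two helper functions are made up for
illustration, and the per-stage results are my reading of the tests
described above (a stage counts as REACHED when a printk or hang placed
there was observed, NOT_REACHED when the machine still rebooted).

```c
#include <assert.h>

/* Boot stages in chronological order, as listed earlier in this thread. */
enum { NSTAGES = 6 };
static const char *const stages[NSTAGES] = {
    "efi_pe_entry",          /* arch/x86/boot/compressed/head_64.S */
    "efi_main",              /* arch/x86/boot/compressed/eboot.c   */
    "startup_64",            /* arch/x86/kernel/head_64.S          */
    "secondary_startup_64",  /* arch/x86/kernel/head_64.S          */
    "x86_64_start_kernel",   /* arch/x86/kernel/head64.c           */
    "setup_arch",
};

/* Hypothetical classification of one probe (hang loop or printk). */
enum probe { UNTESTED, REACHED, NOT_REACHED };

/* Results so far: efi_printk in efi_main was displayed (so efi_pe_entry
 * ran too); loops in x86_64_start_kernel and setup_arch still rebooted. */
static const enum probe observed[NSTAGES] = {
    REACHED, REACHED, UNTESTED, UNTESTED, NOT_REACHED, NOT_REACHED,
};

/* Index of the latest stage known to have executed, or -1 if none. */
int last_reached(const enum probe *obs, int n)
{
    int last = -1;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        if (obs[i] == REACHED)
            last = i;
    return last;
}

/* Index of the earliest stage known not to have executed, or -1 if none. */
int first_not_reached(const enum probe *obs, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        if (obs[i] == NOT_REACHED)
            return i;
    return -1;
}
```

With the table above, last_reached() gives the index of efi_main and
first_not_reached() gives the index of x86_64_start_kernel, so the next
probes belong in the untested stages in between.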

> Could you also attach your .config? In particular I'm wondering whether
> you've got CONFIG_RELOCATABLE enabled.

Config attached (gzipped). CONFIG_RELOCATABLE is not enabled.

Bruno

Attachment: config.gz
Description: application/gzip