Re: 4.8.2 not booting in 32-bit VM without I/O-APIC

From: Michal Necasek
Date: Fri Oct 28 2016 - 14:31:37 EST



Hi Thomas,

In case you haven't had a chance to take a look yet...

We had to dig a bit because the problem introduced by commit 2a51fe08 (arch/x86: Handle non enumerated CPU after physical hotplug) <1> is not fixed for us by commit ff856051 (x86/boot/smp: Don't try to poke disabled/non-existent APIC) <2>.

To recap, after the initial commit, systems with no local APIC panicked <4> early during boot. That showed up for us in VirtualBox, but not surprisingly, physical systems are also affected <3>. The second patch fixes systems with no local APIC, but not systems that do have a local APIC yet have neither an ACPI MADT (or any ACPI at all) nor MP tables.

The core problem is init ordering. In setup_arch() in arch/x86/kernel/setup.c, prefill_possible_map() is called *before* init_apic_mappings(). On typical modern systems, the local APIC will have been set up through either ACPI or MP tables by the time prefill_possible_map() runs, but it is incorrect to assume that this is always the case. On the affected systems the APIC callbacks aren't no-ops at that point, because an APIC is present; the APIC simply hasn't been mapped yet, so the first register read faults.

I suspect that either init_apic_mappings() needs to be called earlier or the initial fix from commit 2a51fe08 needs to be done later.
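
To make the ordering issue concrete, here is a toy user-space model of the two call orders. This is only a sketch and not kernel code: every model_* name is made up for illustration, a NULL pointer stands in for the not-yet-mapped APIC fixmap page, and returning false stands in for the page fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: all model_* names are hypothetical, not kernel symbols. */
enum { MODEL_APIC_ID = 0x20 };            /* APIC ID register offset */

static uint32_t model_apic_page[1024];    /* stands in for the APIC MMIO page */
static volatile uint32_t *model_apic_base = NULL;  /* NULL = not mapped yet */

/* Models init_apic_mappings(): makes the register page reachable. */
static void model_init_apic_mappings(uint32_t boot_apic_id)
{
    assert(boot_apic_id <= 0xff);         /* xAPIC IDs are 8 bits wide */
    model_apic_page[MODEL_APIC_ID / 4] = boot_apic_id << 24;
    model_apic_base = model_apic_page;
}

/*
 * Models the APIC ID read that prefill_possible_map() triggers through
 * hard_smp_processor_id(). Returns false where the real kernel would
 * take the page fault shown in the trace.
 */
static bool model_prefill_possible_map(uint32_t *apic_id)
{
    if (model_apic_base == NULL)
        return false;
    *apic_id = model_apic_base[MODEL_APIC_ID / 4] >> 24;
    return true;
}

/* Runs the two steps in either order; true means the read was safe. */
bool model_boot(bool map_first, uint32_t *apic_id)
{
    bool ok;

    model_apic_base = NULL;               /* fresh boot: nothing mapped */
    if (map_first) {
        model_init_apic_mappings(0);
        ok = model_prefill_possible_map(apic_id);
    } else {
        ok = model_prefill_possible_map(apic_id);
        model_init_apic_mappings(0);
    }
    return ok;
}
```

In this model, model_boot(false, &id) mirrors the current order in setup_arch() and reports the failure, while model_boot(true, &id) mirrors the "call init_apic_mappings() earlier" option and reads APIC ID 0. The other option, deferring the APIC ID read, would amount to the same thing from the opposite direction.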


Regards,
Michal

<1>
https://patchwork.kernel.org/patch/9366095/
<2>
https://patchwork.kernel.org/patch/9390349/
<3>
https://bugs.archlinux.org/task/51506
<4>
Using APIC driver default
ACPI: PM-Timer IO Port: 0x4008
BUG: unable to handle kernel paging request at ffffc020
IP: [<c8045e0d>] native_apic_mem_read+0xd/0x10
*pde = 08b8a063
*pte = 00000000
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-040900rc1-generic #201610151630
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: c89fda80 task.stack: c89f8000
EIP: 0060:[<c8045e0d>] EFLAGS: 00210046 CPU: 0
EIP is at native_apic_mem_read+0xd/0x10
EAX: ffffc020 EBX: ffffffff ECX: c89f9f40 EDX: fffff000
ESI: c8b8d000 EDI: c8b89400 EBP: c89f9f88 ESP: c89f9f84
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 80050033 CR2: ffffc020 CR3: 08b8c000 CR4: 00040690
Stack:
c8040eb6 c89f9fb8 c8accc5e c89f9fb8 c8b8d000 c8ac424a 33120000 00000000
35888000 00000000 00033120 00000000 c80ba5f7 00000000 00000000 00000000
00000000 00174f46 00174f46 0008f800 c8b8d800 08e34003 c8abe7f5 c88f4b62
Call Trace:
[<c8040eb6>] ? hard_smp_processor_id+0x16/0x30
[<c8accc5e>] ? prefill_possible_map+0x16/0x137
[<c8ac424a>] ? setup_arch+0xaf3/0xbdf
[<c80ba5f7>] ? vprintk_default+0x37/0x40
[<c8abe7f5>] ? start_kernel+0x8d/0x3d7
Code: a1 d8 89 b9 c8 5d c3 66 90 66 90 66 90 90 8b 0d b0 f5 a0 c8 8d 84
08 00 d0 ff ff 89 10 c3 8b 15 b0 f5 a0 c8 8d 84 10 00 d0 ff ff <8b> 00 c3
8b 15 20 94 9a c8 53 89 c3 b8 30 00 00 00 ff 52 78 3c
EIP: [<c8045e0d>]
native_apic_mem_read+0xd/0x10
SS:ESP 0068:c89f9f84
CR2: 00000000ffffc020
---[ end trace f68728a0d3053b52 ]---
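
As a cross-check, the faulting address in the trace above is consistent with a read of the APIC ID register (offset 0x20) through the fixmap. The sketch below just redoes that arithmetic. It assumes 4 KiB pages and a fixmap slot index of 3 for the APIC base, which I inferred from the -0x3000 displacement in the Code: bytes rather than from the headers; the function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: virtual address of an APIC register when the APIC
 * page sits in fixmap slot fix_index, counting down from fixaddr_top.
 * Assumes 4 KiB pages (shift of 12).
 */
uint32_t model_apic_reg_vaddr(uint32_t fixaddr_top, uint32_t fix_index,
                              uint32_t reg)
{
    assert(reg < 0x1000);                 /* register must lie in one page */
    return fixaddr_top - (fix_index << 12) + reg;
}
```

With EDX = fffff000 from the register dump as the fixmap top, slot 3, and register 0x20, this gives 0xfffff000 - 0x3000 + 0x20 = 0xffffc020, which matches both EAX and CR2 in the oops.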


----- Original Message -----
From: tglx@xxxxxxxxxxxxx
To: michal.necasek@xxxxxxxxxx
Cc: michael.thayer@xxxxxxxxxx, frank.mehnert@xxxxxxxxxx, knut.osmundsen@xxxxxxxxxx
Sent: Monday, October 24, 2016 9:39:45 PM GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Re: 4.8.2 not booting in 32-bit VM without I/O-APIC

On Mon, 24 Oct 2016, Michal Necasek wrote:
>
> To explain a bit, disabling the I/O APIC also prevents the MP tables
> from being created in the VirtualBox VM (historical reasons) and there
> will likewise be no ACPI MADT.
>
> I believe the panic is triggered when neither ACPI nor MPS does any CPU
> discovery. Then the local APIC isn't mapped and prefill_possible_map()
> will page fault and panic because num_processors is zero and it just
> assumes that the local APIC is present and accessible.

> On systems with no MP tables, 'acpi=off' or 'nolapic' kernel arguments
> trigger the same panic. I didn't find a way to prevent Linux from looking
> at the MP tables if they're present.

Hmm. In both cases we should end up with apic == apic_noop() so any access
to the apic should not result in a panic. I'll have a look.

Thanks,

tglx