Re: [PATCH] x86/boot: Fix boot failure when SMP MP-table is based at 0

From: Tomeu Vizoso
Date: Thu Nov 16 2017 - 04:17:13 EST


Adding regression@xxxxxxxxxxxxx to CC so this regression is tracked.

Regards,

Tomeu

On 8 November 2017 at 09:37, Tomeu Vizoso <tomeu@xxxxxxxxxxxxxxx> wrote:
> On 6 November 2017 at 23:01, Tom Lendacky <thomas.lendacky@xxxxxxx> wrote:
>> On 11/6/2017 3:41 PM, H. Peter Anvin wrote:
>>>
>>> On 11/06/17 12:17, Tom Lendacky wrote:
>>>>
>>>> When crosvm is used to boot a kernel as a VM, the SMP MP-table is found
>>>> at physical address 0x0. This causes mpf_base to be set to 0 and a
>>>> subsequent "if (!mpf_base)" check in default_get_smp_config() results in
>>>> the MP-table not being parsed. Further into the boot this results in an
>>>> oops when attempting a read_apic_id().
>>>>
>>>> Add a boolean variable that is set to true when the MP-table is found.
>>>> Use this variable for testing if the MP-table was found so that even a
>>>> value of 0 for mpf_base will result in continued parsing of the MP-table.
>>>>
>>>> Reported-by: Tomeu Vizoso <tomeu@xxxxxxxxxxxxxxx>
>>>> Signed-off-by: Tom Lendacky <thomas.lendacky@xxxxxxx>
>>>
>>>
>>> Ahem... did anyone ever tell you that this is an epically bad idea on your
>>> part? The low megabyte of physical memory has very special meaning on
>>> x86, and deviating from the standard use of this memory is a *very*
>>> dangerous thing to do, and imposing on the kernel a "fake null pointer"
>>> requirement that exists only for the convenience of your particular
>>> brokenness is not okay.
>>>
>>> -hpa
>>
>>
>> That was my initial thought... what was something doing down at the start
>> of memory. But when I looked at default_find_smp_config() it specifically
>> scans the bottom 1K for an MP-table signature. I was hoping to get some
>> feedback as to whether this would really be an acceptable thing to do. So
>> I'm good with this patch being rejected, but the change I made in
>>
>> 5997efb96756 ("x86/boot: Use memremap() to map the MPF and MPC data")
>>
>> does break something that was working before.
>
> Do I understand correctly that the best we can do right now is
> reverting 5997efb96756?
>
> Thanks,
>
> Tomeu
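
The failure described in the quoted patch description can be sketched in a
few lines of standalone C. This is only an illustration under stated
assumptions, not the kernel code: struct mp_state, record_mp_table() and the
two should_parse_*() helpers are hypothetical names made up for this sketch.
Only mpf_base, the "if (!mpf_base)" style of check, and the idea of a
separate boolean come from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the state discussed in the thread. */
struct mp_state {
	uint64_t mpf_base;   /* physical address where the MP-table was found */
	bool     mpf_found;  /* set once a table has actually been located */
};

/* Record a located table. The address may be 0, as reported for crosvm. */
static void record_mp_table(struct mp_state *s, uint64_t phys)
{
	s->mpf_base = phys;
	s->mpf_found = true;
}

/*
 * Address-as-sentinel check, equivalent to "if (!mpf_base) return;".
 * A table located at physical address 0 is treated as "no table".
 */
static bool should_parse_by_address(const struct mp_state *s)
{
	return s->mpf_base != 0;
}

/*
 * Flag-based check, the approach the patch description proposes.
 * The address value no longer doubles as a "found" indicator.
 */
static bool should_parse_by_flag(const struct mp_state *s)
{
	return s->mpf_found;
}
```

With a table recorded at address 0, should_parse_by_address() returns false
while should_parse_by_flag() returns true. For any non-zero address the two
checks agree, which is why the problem only shows up when the table sits at
the very start of memory.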