Re: KASLR causes intermittent boot failures on some systems

From: Jeff Moyer
Date: Mon Apr 10 2017 - 15:18:21 EST


Kees Cook <keescook@xxxxxxxxxxxx> writes:

> On Mon, Apr 10, 2017 at 11:22 AM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
>> Kees Cook <keescook@xxxxxxxxxxxx> writes:
>>
>>> On Mon, Apr 10, 2017 at 8:49 AM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
>>>> Kees Cook <keescook@xxxxxxxxxxxx> writes:
>>>>
>>>>> On Fri, Apr 7, 2017 at 7:41 AM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> commit 021182e52fe01 ("x86/mm: Enable KASLR for physical mapping memory
>>>>>> regions") causes some of my systems with persistent memory (whether real
>>>>>> or emulated) to fail to boot with a couple of different crash
>>>>>> signatures. The first signature is an NMI watchdog lockup of all but one
>>>>>> CPU, which causes much difficulty in extracting useful information from
>>>>>> the console. The second variant is an invalid paging request, listed
>>>>>> below.
>>>>>
>>>>> Just to rule out some of the stuff in the boot path, does booting with
>>>>> "nokaslr" solve this? (i.e. I want to figure out if this is from some
>>>>> of the rearrangements done that are exposed under that commit, or if
>>>>> it is genuinely the randomization that is killing the systems...)
>>>>
>>>> Adding "nokaslr" to the boot line does indeed make the problem go away.
>>>
>>> Are you booting with a memmap= flag?
>>
>> From my first email:
>>
>> [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.11.0-rc5+
>> root=/dev/mapper/rhel_intel--lizardhead--04-root ro memmap=192G!1024G
>> crashkernel=auto rd.lvm.lv=rhel_intel-lizardhead-04/root
>> rd.lvm.lv=rhel_intel-lizardhead-04/swap console=ttyS0,115200n81
>> LANG=en_US.UTF-8
>>
>> Did you not receive the attachments?
>
> I see it now, thanks!
>
> The memmap parsing was added in -rc1 (f28442497b5ca), so I'd expect
> that to be handled. Hmmm.

I can also reproduce this on a system with real persistent memory, which
does not require the memmap parameter.

Cheers,
Jeff