Re: KASLR causes intermittent boot failures on some systems

From: Kees Cook
Date: Fri Apr 07 2017 - 17:25:48 EST


On Fri, Apr 7, 2017 at 7:41 AM, Jeff Moyer <jmoyer@xxxxxxxxxx> wrote:
> Hi,
>
> commit 021182e52fe01 ("x86/mm: Enable KASLR for physical mapping memory
> regions") causes some of my systems with persistent memory (whether real
> or emulated) to fail to boot with a couple of different crash
> signatures. The first signature is an NMI watchdog lockup of all but 1
> cpu, which causes much difficulty in extracting useful information from
> the console. The second variant is an invalid paging request, listed
> below.

Just to rule out some of the stuff in the boot path, does booting with
"nokaslr" on the kernel command line solve this? (i.e. I want to figure
out if this is from some of the rearrangements that are exposed by that
commit, or if it is genuinely the randomization that is killing the
systems...)
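
As a sanity check while testing, it helps to confirm that the option
really made it onto the running kernel's command line, since it is easy
to edit the wrong bootloader entry. A minimal sketch in Python, assuming
the usual /proc/cmdline interface; the helper names are made up for
illustration:

```python
def has_boot_param(cmdline: str, param: str = "nokaslr") -> bool:
    """Return True if `param` appears as a whole token in a kernel command line."""
    # Tokens are whitespace-separated; match the bare flag only, so that
    # something like "foo=nokaslr" does not count.
    return param in cmdline.split()


def running_kernel_has(param: str = "nokaslr") -> bool:
    """Check the command line the running kernel was booted with."""
    # /proc/cmdline is only present on Linux; call this on the test machine.
    with open("/proc/cmdline") as f:
        return has_boot_param(f.read(), param)
```

Calling running_kernel_has() after each test boot shows whether that
boot actually ran with randomization disabled.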

> On some systems, I haven't hit this problem at all. Other systems
> experience a failed boot maybe 20-30% of the time. To reproduce it,
> configure some emulated pmem on your system. You can find directions
> for that here: https://nvdimm.wiki.kernel.org/
>
> Install ndctl (https://github.com/pmem/ndctl).
> Configure the namespace:
> # ndctl create-namespace -f -e namespace0.0 -m memory
>
> Then just reboot several times (5 should be enough), and hopefully
> you'll hit the issue.
>
> I've attached both my .config and the dmesg output from a successful
> boot at the end of this mail.

Thanks! Considering I know nothing about pmem (yet), I bet there is
some oversight in what's happening with how KASLR scans for available
memory areas. I'll carve out some time next week to look into this.
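
One thing to keep in mind when trying nokaslr: with failures on only
20-30% of boots, a few clean boots in a row prove little. A rough sketch
of the arithmetic in Python, assuming each boot fails independently with
a fixed probability p (an assumption; nothing in this thread establishes
that), with function names that are purely illustrative:

```python
import math


def p_at_least_one_failure(p: float, boots: int) -> float:
    """Chance of seeing at least one failed boot in `boots` attempts."""
    return 1.0 - (1.0 - p) ** boots


def boots_needed(p: float, confidence: float = 0.95) -> int:
    """Smallest number of boots that shows a failure with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


for p in (0.2, 0.3):
    print(f"p={p}: 5 boots -> {p_at_least_one_failure(p, 5):.2f}, "
          f"95% needs {boots_needed(p)} boots")
```

Under that assumption, five reboots hit the failure roughly 67-83% of
the time, and something like 9 to 14 clean boots with nokaslr would be
needed before reading much into the result.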

-Kees

--
Kees Cook
Pixel Security