Re: [PATCH] x86: fix kaslr and memmap collision
From: Ross Zwisler
Date: Tue Jan 03 2017 - 11:28:26 EST
On Tue, Jan 03, 2017 at 04:31:37PM +0800, Baoquan He wrote:
> Hi Dan,
>
> On 11/22/16 at 09:26am, Dan Williams wrote:
> > [ replying for Dave since he's offline today and tomorrow ]
> >
> > On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> > >
> > > * Dave Jiang <dave.jiang@xxxxxxxxx> wrote:
> > >
> > >> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
> > >> However it does not take into account the memmap= parameter passed in from
> > >> the kernel commandline.
> > >
> > > memmap= parameters are often used as a list.
> > >
> > >> [...] This results in the kernel sometimes being put in the middle of the user
> > >> memmap. [...]
> > >
> > > What does this mean? If memmap= is used to re-define the memory map then the
> > > kernel getting in the middle of a RAM area is what we want, isn't it? What we
> > > don't want is for the kernel to get into reserved areas, right?
> >
> > Right, this is about teaching kaslr to not land the kernel in newly
> > defined reserved regions that were not marked reserved in the initial
> > e820 map from platform firmware.
>
> If we only tell kaslr not to land the kernel in newly defined reserved
> regions, memory added by "memmap=nn[KMG]@ss[KMG]" should not be avoided,
> since it is usable memory. Randomizing the kernel into this region is also
> what we want. I am not sure I understand this correctly.

The following text is from:

https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel_parameter_for_pmem_on_your_system

Hopefully this will make it clearer.

---
Another thing that you may need to be aware of is the CONFIG_RANDOMIZE_BASE
kernel config option. When enabled, this randomizes the physical address at
which the kernel image is decompressed and the virtual address where the kernel
image is mapped. Currently this random address is chosen without regard to the
memmap kernel command line parameter.
This means that the kernel can choose to put itself in the middle of your
reserved memmap area. You can observe this behavior via /proc/iomem.
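
To make the collision concrete, here is a minimal sketch (not the kernel's code and not part of the patch) of the check that the random placement currently lacks: collect the memmap= ranges that are not plain RAM and test a candidate kernel range against them. The function names are hypothetical, and only the nn[KMG]<sep>ss[KMG] forms with the @, #, $ and ! separators are handled; anything else, such as memmap=exactmap, is ignored here.

```python
import re

_SUFFIX = {"": 1, "k": 1 << 10, "m": 1 << 20, "g": 1 << 30}
_NUM = r"(0x[0-9a-f]+|\d+)([kmg]?)"
# size, separator, start; '@' marks usable RAM, the other separators mark
# regions the kernel image should stay out of.
_ENTRY_RE = re.compile(rf"^{_NUM}([@#$!]){_NUM}$", re.IGNORECASE)


def _to_bytes(digits, suffix):
    """Convert a decimal or 0x-prefixed number plus optional K/M/G suffix."""
    base = 16 if digits.lower().startswith("0x") else 10
    return int(digits, base) * _SUFFIX[suffix.lower()]


def avoided_ranges(cmdline):
    """Return inclusive (start, end) ranges for memmap= entries that are not RAM."""
    ranges = []
    for token in cmdline.split():
        if not token.startswith("memmap="):
            continue
        # Assumption: a single memmap= may carry a comma-separated list.
        for entry in token[len("memmap="):].split(","):
            m = _ENTRY_RE.match(entry)
            if m is None:  # unsupported form, skipped in this sketch
                continue
            size = _to_bytes(m.group(1), m.group(2))
            start = _to_bytes(m.group(4), m.group(5))
            if m.group(3) != "@" and size > 0:
                ranges.append((start, start + size - 1))
    return ranges


def collides(start, end, ranges):
    """True if the inclusive range [start, end] overlaps any avoided range."""
    return any(start <= r_end and r_start <= end for r_start, r_end in ranges)
```

With a hypothetical command line containing memmap=16G!4G, avoided_ranges() yields the single range 0x100000000-0x4ffffffff, and any candidate kernel range for which collides() returns True would have to be rejected.
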

Here is /proc/iomem from a system with CONFIG_RANDOMIZE_BASE turned off:

# cat /proc/iomem
00000000-00000fff : reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000ca800-000ccbff : Adapter ROM
000f0000-000fffff : reserved
  000f0000-000fffff : System ROM
00100000-bffd8fff : System RAM
  01000000-01b18598 : Kernel code
  01b18599-023f53ff : Kernel data
  0276d000-0365efff : Kernel bss
bffd9000-bfffffff : reserved
c0000000-febfffff : PCI Bus 0000:00
  f4000000-f7ffffff : 0000:00:02.0
  f8000000-fbffffff : 0000:00:02.0
  fc000000-fc03ffff : 0000:00:03.0
  fc050000-fc051fff : 0000:00:02.0
  fc052000-fc052fff : 0000:00:03.0
  fc053000-fc053fff : 0000:00:04.0
  fc054000-fc054fff : 0000:00:05.7
    fc054000-fc054fff : ehci_hcd
  fc055000-fc055fff : 0000:00:06.0
fec00000-fec003ff : IOAPIC 0
fee00000-fee00fff : Local APIC
feffc000-feffffff : reserved
fffc0000-ffffffff : reserved
100000000-4ffffffff : Persistent Memory (legacy)
  100000000-4ffffffff : namespace0.0
500000000-53fffffff : System RAM

The interesting bits for us are the "System RAM" region from 00100000-bffd8fff,
and the "Persistent Memory (legacy)" region from 100000000-4ffffffff.

If I turn on CONFIG_RANDOMIZE_BASE on this same system, I get the following:

# cat /proc/iomem
00000000-00000fff : reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000ca800-000ccbff : Adapter ROM
000f0000-000fffff : reserved
  000f0000-000fffff : System ROM
00100000-bffd8fff : System RAM
bffd9000-bfffffff : reserved
c0000000-febfffff : PCI Bus 0000:00
  f4000000-f7ffffff : 0000:00:02.0
  f8000000-fbffffff : 0000:00:02.0
  fc000000-fc03ffff : 0000:00:03.0
  fc050000-fc051fff : 0000:00:02.0
  fc052000-fc052fff : 0000:00:03.0
  fc053000-fc053fff : 0000:00:04.0
  fc054000-fc054fff : 0000:00:05.7
    fc054000-fc054fff : ehci_hcd
  fc055000-fc055fff : 0000:00:06.0
fec00000-fec003ff : IOAPIC 0
fee00000-fee00fff : Local APIC
feffc000-feffffff : reserved
fffc0000-ffffffff : reserved
100000000-4e6ffffff : Persistent Memory (legacy)
4e7000000-4e968bfff : System RAM
  4e7000000-4e7b185d8 : Kernel code
  4e7b185d9-4e83f54bf : Kernel data
  4e876d000-4e965efff : Kernel bss
4e968c000-4ffffffff : Persistent Memory (legacy)
500000000-53fffffff : System RAM

The "System RAM" region now sits in the middle of my "Persistent Memory
(legacy)" region, splitting it in half. This results in the following kernel
WARNING:

[ 6.356180] WARNING: CPU: 4 PID: 689 at kernel/memremap.c:300 devm_memremap_pages+0x3b2/0x4c0
[ 6.357757] devm_memremap_pages attempted on mixed region [mem 0x4e968c000-0x4ffffffff flags 0x200]

and no /dev/pmem* devices being created.
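
If you want to check a system for this without reading the listing by eye, something like the following rough sketch could be used. It is an illustration only, with hypothetical function names: it parses /proc/iomem text, keeps the top-level regions, and reports any "System RAM" region sandwiched directly between two "Persistent Memory (legacy)" regions. Nesting is inferred from the addresses rather than from indentation, and the region names are assumed to match the ones shown above.

```python
import re

_LINE_RE = re.compile(r"^\s*([0-9a-f]+)-([0-9a-f]+) : (.*)$")


def top_level_regions(iomem_text):
    """Return [(start, end, name)] for regions not contained in the previous one."""
    regions = []
    for line in iomem_text.splitlines():
        m = _LINE_RE.match(line)
        if m is None:
            continue
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        # The listing is sorted, so a child always falls inside the most
        # recent top-level region.
        if regions and start >= regions[-1][0] and end <= regions[-1][1]:
            continue
        regions.append((start, end, m.group(3).strip()))
    return regions


def ram_inside_pmem(regions, pmem_name="Persistent Memory (legacy)"):
    """Return System RAM regions that sit directly between two pmem regions."""
    hits = []
    for prev, cur, nxt in zip(regions, regions[1:], regions[2:]):
        adjacent = prev[1] + 1 == cur[0] and cur[1] + 1 == nxt[0]
        if (cur[2] == "System RAM" and prev[2] == pmem_name
                and nxt[2] == pmem_name and adjacent):
            hits.append(cur)
    return hits
```

Feeding it the contents of /proc/iomem, for example ram_inside_pmem(top_level_regions(open("/proc/iomem").read())), returns a non-empty list when the kernel has landed inside the pmem range as in the second listing.
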