Re: Bug report about KASLR and ZONE_MOVABLE

From: Chao Fan
Date: Wed Jul 11 2018 - 06:18:37 EST


More explanation:

Suppose a machine has 10 nodes, with 20G of memory in each node.
Then 'kernelcore=100G' sets the last 10G of memory in each node as
ZONE_MOVABLE.
But if KASLR puts the kernel at the 19G position of the first node, that
region can not be offlined. So we should set only the last 1G of the
first node, and the last 11G of each of the other 9 nodes, as ZONE_MOVABLE.

Thanks,
Chao Fan

On Wed, Jul 11, 2018 at 05:42:44PM +0800, Chao Fan wrote:
>Hi all,
>
>I found a bug involving KASLR and ZONE_MOVABLE.
>
>When users pass the 'kernelcore=' parameter without 'movable_node',
>movable memory is evenly distributed across all nodes. The size of
>ZONE_MOVABLE depends on the kernel parameters 'kernelcore=' and
>'movablecore='.
>But sometimes KASLR may put the uncompressed kernel at the
>tail of a node, which causes the kernel's memory to be
>set as ZONE_MOVABLE. Such a region can not be offlined.
>
>Here is a very simple test on my qemu-kvm machine, which has
>only one node:
>
>The command line:
>[root@localhost ~]# cat /proc/cmdline
>BOOT_IMAGE=/vmlinuz-4.18.0-rc3+ root=/dev/mapper/fedora_localhost--live-root
>ro resume=/dev/mapper/fedora_localhost--live-swap
>rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap
>console=ttyS0 earlyprintk=ttyS0,115200n8 memblock=debug kernelcore=50%
>
>I use 'kernelcore=50%' here.
>
>Here is my early print output; I print random_addr after KASLR chooses
>the physical address:
>early console in extract_kernel
>input_data: 0x000000000266b3b1
>input_len: 0x00000000007d8802
>output: 0x0000000001000000
>output_len: 0x0000000001e15698
>kernel_total_size: 0x0000000001a8b000
>trampoline_32bit: 0x000000000009d000
>booted via startup_32()
>Physical KASLR using RDRAND RDTSC...
>random_addr: 0x000000012f000000
>Virtual KASLR using RDRAND RDTSC...
>
>The address chosen for the kernel is 0x000000012f000000.
>
>Here is the zone log:
>[ 0.000000] Zone ranges:
>[ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff]
>[ 0.000000] DMA32 [mem 0x0000000001000000-0x00000000ffffffff]
>[ 0.000000] Normal [mem 0x0000000100000000-0x00000001f57fffff]
>[ 0.000000] Device empty
>[ 0.000000] Movable zone start for each node
>[ 0.000000] Node 0: 0x000000011b000000
>[ 0.000000] Early memory node ranges
>[ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009efff]
>[ 0.000000] node 0: [mem 0x0000000000100000-0x00000000bffd6fff]
>[ 0.000000] node 0: [mem 0x0000000100000000-0x00000001f57fffff]
>[ 0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x00000001f57fffff]
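
The "Movable zone start" value in the log above can be roughly reproduced
from the early memory node ranges. This is a simplified model, not the
actual logic of find_zone_movable_pfns_for_nodes(): it assumes kernelcore is
taken from the lowest addresses of the node first, and that the resulting
boundary is rounded up to a 4 MiB (1024-page) alignment. The function name
and both assumptions are mine.

```python
PAGE_SIZE = 0x1000
ALIGN_PAGES = 1024  # assumed: zone boundary rounded up to 4 MiB

# "Early memory node ranges" from the log, as inclusive byte ranges.
RANGES = [
    (0x0000000000001000, 0x000000000009efff),
    (0x0000000000100000, 0x00000000bffd6fff),
    (0x0000000100000000, 0x00000001f57fffff),
]


def movable_start(ranges, kernelcore_percent):
    """Estimate where ZONE_MOVABLE begins on a single-node machine."""
    # Convert inclusive byte ranges to half-open page frame ranges.
    pfn_ranges = [(s // PAGE_SIZE, (e + 1) // PAGE_SIZE) for s, e in ranges]
    total_pages = sum(e - s for s, e in pfn_ranges)
    remaining = total_pages * kernelcore_percent // 100
    for s, e in pfn_ranges:
        size = e - s
        if remaining >= size:
            remaining -= size  # this whole range is kernelcore
            continue
        boundary = s + remaining
        # Round up to the assumed alignment.
        boundary = -(-boundary // ALIGN_PAGES) * ALIGN_PAGES
        return boundary * PAGE_SIZE
    return None  # no room left for a movable zone


start = movable_start(RANGES, 50)
print(hex(start))  # 0x11b000000
```

Under these assumptions the estimate matches the value reported for node 0.
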
>
>There is only one node in my machine, and ZONE_MOVABLE begins at
>0x000000011b000000, which is lower than 0x000000012f000000.
>So KASLR put the kernel into ZONE_MOVABLE.
>To solve this problem, I think there should be a new tactic in the function
>find_zone_movable_pfns_for_nodes() in mm/page_alloc.c. If the kernel is
>uncompressed to a tail position, then set only the memory after the kernel as
>ZONE_MOVABLE on that node; at the same time, memory in the other nodes will be
>set as ZONE_MOVABLE to make up the difference.
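
A minimal sketch of the overlap check and of the proposed tactic, using the
addresses from the log above. It is not a patch for
find_zone_movable_pfns_for_nodes(); the function names are hypothetical, the
kernel footprint is assumed to be kernel_total_size, and the 4 MiB rounding
of the new boundary is also an assumption.

```python
ALIGN = 4 << 20  # assumed 4 MiB alignment for the zone boundary

MOVABLE_START = 0x000000011b000000  # "Movable zone start" for node 0 in the log
NODE_END = 0x00000001f5800000       # one byte past the end of node 0
KERNEL_START = 0x000000012f000000   # random_addr from the early print
KERNEL_SIZE = 0x0000000001a8b000    # kernel_total_size, assumed footprint


def kernel_in_movable(movable_start, node_end, kernel_start, kernel_size):
    """True if the kernel image overlaps [movable_start, node_end)."""
    kernel_end = kernel_start + kernel_size
    return kernel_start < node_end and kernel_end > movable_start


def adjusted_movable_start(movable_start, node_end, kernel_start, kernel_size):
    """Hypothetical fix-up: begin ZONE_MOVABLE after the kernel image."""
    if not kernel_in_movable(movable_start, node_end, kernel_start, kernel_size):
        return movable_start
    kernel_end = kernel_start + kernel_size
    return -(-kernel_end // ALIGN) * ALIGN  # round up to the next boundary


overlaps = kernel_in_movable(MOVABLE_START, NODE_END, KERNEL_START, KERNEL_SIZE)
new_start = adjusted_movable_start(MOVABLE_START, NODE_END, KERNEL_START, KERNEL_SIZE)
print(overlaps, hex(new_start))  # True 0x130c00000
```

With these numbers the movable zone on node 0 would shrink by 0x15c00000
bytes (348 MiB), which on a multi-node machine would have to be made up on
the other nodes.
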
>
>If there is something wrong, please let me know. Any comments are welcome.
>
>Thanks,
>Chao Fan