Re: [PATCH v2 2/3] x86/mm/KASLR: Calculate the actual size of vmemmap region

From: Baoquan He
Date: Tue Sep 11 2018 - 23:18:18 EST


On 09/11/18 at 08:08pm, Baoquan He wrote:
> On 09/11/18 at 11:28am, Ingo Molnar wrote:
> > Yeah, so proper context is still missing, this paragraph appears to assume from the reader a
> > whole lot of prior knowledge, and this is one of the top comments in kaslr.c so there's nowhere
> > else to go read about the background.
> >
> > For example what is the range of randomization of each region? Assuming the static,
> > non-randomized description in Documentation/x86/x86_64/mm.txt is correct, in what way does
> > KASLR modify that layout?

Re-reading this paragraph, I found I missed describing the
randomization range for each memory region, and in what way KASLR
modifies the layout.

> >
> > All of this is very opaque and not explained very well anywhere that I could find. We need to
> > generate a proper description ASAP.
>
> OK, let me try to give an context with my understanding. And copy the
> static layout of memory regions at below for reference.
>
Here, Documentation/x86/x86_64/mm.txt is correct, and it is the
guideline we follow when manipulating the layout of the kernel memory
regions. Originally the starting address of each region is aligned to
512 GB, i.e. to a PGD entry boundary, since one PGD entry covers 512 GB
in 4-level paging. And since we have the luxury of about 120 TB of
virtual address space, the regions are in fact placed on 1 TB
boundaries. So the randomness mainly comes from three parts:

1) The direct mapping region for physical memory. 64 TB are reserved to
cover the maximum supported physical memory. However, most systems have
much less RAM than 64 TB, often even less than 1 TB. We can take the
superfluous space and feed it into the randomization. This is usually
the biggest part.

2) The holes between the memory regions, even though they are only 1 TB
each.

3) The KASAN region takes up 16 TB, but it does not take effect when
KASLR is enabled, so its space can be reused as well. This is another
big part.
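
For example, on a hypothetical machine with 1 TB of RAM (an assumption
purely for illustration), part 1) contributes roughly 64 - 1 = 63 TB,
part 2) contributes 2 x 1 TB = 2 TB, and part 3) contributes 16 TB, so
about 81 TB of extra space is available for randomization.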

As you can see, of these three memory regions, the physical memory
mapping region has a variable size depending on the actual system RAM,
while the remaining two regions have fixed sizes: vmalloc is 32 TB and
vmemmap is 1 TB.

With this superfluous address space, and by relaxing the alignment of
each region's starting address to PUD level, namely 1 GB, we get
thousands of candidate positions at which to place those three memory
regions.
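
As a rough illustration, with the ~81 TB of slack from the example
above, 1 GB alignment gives on the order of 80,000 possible 1 GB
aligned offsets in total to be shared among the three regions.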

The above is for 4-level paging mode. As for 5-level paging, since the
virtual address space is so much bigger, Kirill made the starting
addresses of the regions P4D aligned, namely 512 GB.

When randomizing the layout, the order of the regions is kept: the
physical memory mapping region is handled first, then vmalloc, then
vmemmap. Take the physical memory mapping region as an example: its
starting address is limited to the first 1/3 of the whole available
virtual address space, which spans from 0xffff880000000000 to
0xfffffe0000000000, namely from the original starting address of the
physical memory mapping region to the starting address of the
cpu_entry_area mapping region. Once a random address has been chosen
for the physical memory mapping, we jump over that region and add 1 GB
before handling the next region within the remaining available space.
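
To make the walk concrete, below is a simplified userspace sketch of
the scheme described above, loosely patterned after
kernel_randomize_memory() in arch/x86/mm/kaslr.c. The 1 TB of "RAM",
the plain rand() PRNG and the even split of the remaining entropy are
illustrative assumptions of mine, not the kernel's actual choices:

#include <stdio.h>
#include <stdlib.h>

#define TB		(1UL << 40)
#define GB		(1UL << 30)
#define PUD_SIZE	GB			/* 4-level paging: 1 GB alignment */

#define VADDR_START	0xffff880000000000UL	/* start of the direct mapping    */
#define VADDR_END	0xfffffe0000000000UL	/* start of cpu_entry_area        */

struct region {
	const char *name;
	unsigned long size;	/* space the region actually needs */
	unsigned long base;	/* randomized starting address     */
};

int main(void)
{
	struct region regions[] = {
		{ "physical mapping",  1 * TB },	/* assume 1 TB of RAM */
		{ "vmalloc",          32 * TB },
		{ "vmemmap",           1 * TB },
	};
	size_t i, n = sizeof(regions) / sizeof(regions[0]);
	unsigned long vaddr = VADDR_START;
	unsigned long remain = VADDR_END - VADDR_START;

	/* Whatever is left after the regions' own sizes is the entropy pool. */
	for (i = 0; i < n; i++)
		remain -= regions[i].size;

	for (i = 0; i < n; i++) {
		/* Spend a share of the remaining entropy, 1 GB aligned.       */
		/* The kernel uses a properly seeded PRNG; rand() is only a    */
		/* stand-in for this sketch.                                   */
		unsigned long entropy = remain / (n - i);
		double frac = (double)rand() / ((double)RAND_MAX + 1.0);

		entropy = (unsigned long)(frac * entropy) & ~(PUD_SIZE - 1);
		vaddr += entropy;
		regions[i].base = vaddr;

		/* Jump over the region and add 1 GB before the next one. */
		vaddr += regions[i].size + PUD_SIZE;
		remain -= entropy;
	}

	for (i = 0; i < n; i++)
		printf("%-16s starts at 0x%016lx\n", regions[i].name, regions[i].base);

	return 0;
}

With 4-level paging the alignment above is PUD level (1 GB); with
5-level paging the same walk would use 512 GB (P4D) alignment instead.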


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ffff880000000000 - ffffc7ffffffffff (=64 TB) direct mapping of all phys. memory
136T - 200T = 64TB
ffffc80000000000 - ffffc8ffffffffff (=40 bits) hole
200T - 201T = 1TB
ffffc90000000000 - ffffe8ffffffffff (=45 bits) vmalloc/ioremap space
201T - 233T = 32TB
ffffe90000000000 - ffffe9ffffffffff (=40 bits) hole
233T - 234T = 1TB
ffffea0000000000 - ffffeaffffffffff (=40 bits) virtual memory map (1TB)
234T - 235T = 1TB
... unused hole ...
ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
236T - 252T = 16TB
... unused hole ...
vaddr_end for KASLR
fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
254T - 254T+512G

Thanks
Baoquan