KASLR may break some kernel features (was Re: [PATCH v5 1/4] kaslr: add immovable_mem=nn[KMG]@ss[KMG] to specify extracting memory)

From: Luiz Capitulino
Date: Thu Jan 04 2018 - 11:21:12 EST


On Thu, 4 Jan 2018 18:30:57 +0800
Baoquan He <bhe@xxxxxxxxxx> wrote:

> On 01/04/18 at 04:02pm, Chao Fan wrote:
> > In current code, kaslr may choose the memory region in movable
> > nodes to extract kernel, which will make the nodes can't be hot-removed.
> > To solve it, we can specify the memory region in immovable node.
> > Create immovable_mem to store the regions in immovable_mem, where should
> > be chosen by kaslr.

[...]

> Hi Chao,
>
> Thanks for your effort on this issue.
>
> Luiz told me they met a hugetlb issue when kaslr enabled on kvm guest.
> Please check the below bug information. There's only one available
> position which hugepage can use to allocate. In this case, if we have a
> generic parameter to tell kernel where we can randomize into, this
> hugepage issue can be solved. We can restrict kernel to randomize beyond
> [0x40000000, 0x7fffffff]. Not sure if your immovable_mem=nn[KMG]@ss[KMG]
> can be adjusted to do this. I am hesitating on whether we should change
> this or not.

Having a generic kaslr parameter to control where the kernel is extracted
is one solution for this problem.

The general problem statement is that KASLR may break some kernel features
depending on where the kernel is extracted. Two examples are hot-plugged
memory (this series) and 1GB HugeTLB pages.

The 1GB HugeTLB page issue is not specific to KVM guests. It just happens
that there's a bunch of people running guests with up to 5GB of memory, and
with that amount of memory you have only one or two 1GB pages, so it is easy
for KASLR to extract the kernel into a 1GB region and split a 1GB page. When
this happens, you may not get any 1GB pages at all. However, I can also
reproduce this on bare-metal with lots of memory, where I lose a 1GB
page from time to time.
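
To make the splitting concrete, here is a rough sketch (plain userspace C,
not kernel code) that counts how many 1GB-aligned chunks of a RAM region
stay intact for a given kernel placement. The helper names and the
single-region memory model are assumptions made for illustration only; the
real memory map is more complicated than this.

```c
#include <assert.h>
#include <stdint.h>

#define GB_SHIFT 30
#define GB_SIZE  (1ULL << GB_SHIFT)

/* Count the 1GB-aligned chunks fully contained in [start, end). */
static uint64_t gb_chunks(uint64_t start, uint64_t end)
{
	uint64_t first = (start + GB_SIZE - 1) >> GB_SHIFT;	/* round up */
	uint64_t last = end >> GB_SHIFT;			/* round down */

	return last > first ? last - first : 0;
}

/*
 * Hypothetical helper: number of 1GB chunks of [ram_start, ram_end) that
 * remain untouched when the kernel occupies [kstart, kstart + ksize).
 */
static uint64_t gb_chunks_left(uint64_t ram_start, uint64_t ram_end,
			       uint64_t kstart, uint64_t ksize)
{
	uint64_t kend = kstart + ksize;
	uint64_t left, right;

	assert(ksize > 0 && ram_start < ram_end);

	/* Kernel lies entirely outside the region: nothing is split. */
	if (kend <= ram_start || kstart >= ram_end)
		return gb_chunks(ram_start, ram_end);

	/* Kernel splits the region: count intact chunks on each side. */
	left = kstart > ram_start ? gb_chunks(ram_start, kstart) : 0;
	right = kend < ram_end ? gb_chunks(kend, ram_end) : 0;

	return left + right;
}
```

With the region [0x40000000, 0x80000000) from Baoquan's mail, a 32MB kernel
placed at 0x50000000 leaves zero 1GB chunks, while the same kernel placed at
0x01000000 leaves the one chunk alone.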

Having a kaslr_range= parameter solves both issues, but two major drawbacks
are that it breaks existing setups and that I guess users will have a very
hard time choosing good ranges.

Another idea would be to have a CONFIG_KASLR_RANGES, where each arch
could have a list of ranges known to contain holes and/or immovable
memory, and the kernel would only be extracted into those ranges.
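
As a rough sketch of what range-restricted selection could look like
(userspace C, not a patch): count the aligned slots available in each
allowed range, then map a random value onto one of them. struct kaslr_range,
range_slots() and pick_address() are hypothetical names, and the ranges used
with them would come from whatever list the arch provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one window [start, start + size) the kernel may land in. */
struct kaslr_range {
	uint64_t start;
	uint64_t size;
};

/* Number of aligned positions in r where an image of image_size fits. */
static uint64_t range_slots(const struct kaslr_range *r,
			    uint64_t image_size, uint64_t align)
{
	uint64_t first = (r->start + align - 1) / align * align;
	uint64_t end = r->start + r->size;

	if (first >= end || end - first < image_size)
		return 0;

	return (end - first - image_size) / align + 1;
}

/*
 * Map the random value rnd onto one slot across all ranges.
 * Returns 0 when no range can hold the image (0 is used as the
 * failure value here only to keep the sketch short).
 */
static uint64_t pick_address(const struct kaslr_range *ranges, size_t n,
			     uint64_t image_size, uint64_t align,
			     uint64_t rnd)
{
	uint64_t total = 0, slot;
	size_t i;

	assert(align > 0 && image_size > 0);

	for (i = 0; i < n; i++)
		total += range_slots(&ranges[i], image_size, align);
	if (total == 0)
		return 0;

	slot = rnd % total;
	for (i = 0; i < n; i++) {
		uint64_t s = range_slots(&ranges[i], image_size, align);

		if (slot < s) {
			uint64_t first = (ranges[i].start + align - 1)
					 / align * align;
			return first + slot * align;
		}
		slot -= s;
	}

	return 0;	/* not reached when total > 0 */
}
```

For example, with the two made-up ranges [0x01000000, 0x40000000) and
[0x80000000, 0xc0000000), every address returned for a 32MB image falls
outside [0x40000000, 0x7fffffff], so the 1GB page there is never split.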