Re: KASLR may break some kernel features (was Re: [PATCH v5 1/4] kaslr: add immovable_mem=nn[KMG]@ss[KMG] to specify extracting memory)

From: Luiz Capitulino
Date: Fri Jan 12 2018 - 13:52:12 EST


On Fri, 12 Jan 2018 10:47:53 +0800
Chao Fan <fanc.fnst@xxxxxxxxxxxxxx> wrote:

> On Fri, Jan 12, 2018 at 10:31:52AM +0800, Baoquan He wrote:
> >On 01/11/18 at 10:04am, Kees Cook wrote:
> >> On Thu, Jan 11, 2018 at 1:00 AM, Baoquan He <bhe@xxxxxxxxxx> wrote:
> >> > Hi Luiz,
> >> >
> >> > On 01/04/18 at 11:21am, Luiz Capitulino wrote:
> >> >> Having a generic kaslr parameter to control where the kernel is extracted
> >> >> is one solution for this problem.
> >> >>
> >> >> The general problem statement is that KASLR may break some kernel features
> >> >> depending on where the kernel is extracted. Two examples are hot-plugged
> >> >> memory (this series) and 1GB HugeTLB pages.
> >> >>
> >> >> The 1GB HugeTLB page issue is not specific to KVM guests. It just happens
> >> >> that there's a bunch of people running guests with up to 5GB of memory and
> >> >> with that amount of memory you have one or two 1GB pages and it is easier for
> >> >> KASLR to extract the kernel into a 1GB region and split a 1GB page. So,
> >> >> you may not get any 1GB pages at all when this happens. However, I can also
> >> >> reproduce this on bare-metal with lots of memory where I can lose a 1GB
> >> >> page from time to time.
> >> >>
> >> >> Having a kaslr_range= parameter solves both issues, but two major drawbacks
> >> >> are that it breaks existing setups and I guess users will have a very hard
> >> >> time choosing good ranges.
> >> >>
> >> >> Another idea would be to have a CONFIG_KASLR_RANGES, where each arch
> >> >> could have a list of ranges known to contain holes and/or immovable
> >> >> memory and only extract the kernel into those ranges.
> >> >
> >> > If we add CONFIG_KASLR_RANGES, then a distro like RHEL will have this range
> >> > always, whether people need hugetlb or not.
> >> >
> >> > So in this case, what range do we need to avoid? Only [1G, 2G]?
> >>
> >> Any ranges like that that need to be avoided should be known at build
> >> time, so they should simply be added to the mem_avoid list that is
> >> already present in the KASLR code...
> >
> >Seems KASLR doesn't have a solution which allows the user to specify an
> >avoided range for the kernel text KASLR stage only. memmap="!#$" can add
> >ranges to mem_avoid, but it also keeps them from being added to e820.
> >
>
> How about adding a new option, like "huge_page=nn@ss", that adds the
> regions to mem_avoid? This parameter would only be parsed during the KASLR
> stage; the subsequent memmap handling would not be executed for it.

If we add a new option, I think we should try to make it general enough
to satisfy both hugepages and the memory hotplug problem. Otherwise
we'll end up adding a new option for each feature KASLR breaks...
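
To make "general enough" concrete, here is a rough userspace sketch of
parsing a single nn[KMG]@ss[KMG] value into a range that any feature
could feed to mem_avoid. The names parse_size(), parse_avoid_range() and
struct avoid_range are hypothetical and only for illustration; in the
kernel the suffix handling would presumably be done by memparse().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for one range the kernel must not be placed in. */
struct avoid_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Userspace stand-in for the kernel's memparse(): a number followed by an
 * optional K/M/G suffix. *end is left at s if no digits were consumed.
 */
static uint64_t parse_size(const char *s, const char **end)
{
	char *e;
	uint64_t v = strtoull(s, &e, 0);

	if (e == s) {
		*end = s;
		return 0;
	}
	switch (*e) {
	case 'G': case 'g': v <<= 10; /* fall through */
	case 'M': case 'm': v <<= 10; /* fall through */
	case 'K': case 'k': v <<= 10; e++; break;
	default: break;
	}
	*end = e;
	return v;
}

/* Parse "nn[KMG]@ss[KMG]". Returns 0 on success, -1 on malformed input. */
static int parse_avoid_range(const char *arg, struct avoid_range *out)
{
	const char *p, *q;
	uint64_t size, start;

	assert(arg && out);

	size = parse_size(arg, &p);
	if (p == arg || *p != '@')
		return -1;

	start = parse_size(p + 1, &q);
	if (q == p + 1 || *q != '\0')
		return -1;

	out->size = size;
	out->start = start;
	return 0;
}
```

With something like this, hugepages and memory hotplug would both just
pass ranges in the same format instead of each growing its own option.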

However, in the case of the 1GB page problem, I'm starting to think
that it may be possible to know which 1GB areas are already fragmented
and extract the kernel to one of those areas. I don't know if this would
help the memory hotplug issue though.
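
A rough sketch of what I mean, in plain userspace C. This is not the
existing KASLR slot selection code: struct mem_region,
pick_fragmented_slot() and fit() are made-up names, the memory map is
assumed to be sorted with adjacent usable regions already merged, and a
return value of 0 means "no slot found" (so address 0 is never treated
as a valid choice). The idea is to only place the kernel in the parts of
a region that cannot back a full, aligned 1GB page anyway.

```c
#include <assert.h>
#include <stdint.h>

#define GB (1ULL << 30)

/* Hypothetical usable-RAM region, [start, end). */
struct mem_region {
	uint64_t start;
	uint64_t end;
};

static uint64_t round_up(uint64_t x, uint64_t a)   { return (x + a - 1) / a * a; }
static uint64_t round_down(uint64_t x, uint64_t a) { return x / a * a; }

/* Try to place an image of 'size' bytes in [lo, hi); 0 if it does not fit. */
static uint64_t fit(uint64_t lo, uint64_t hi, uint64_t size, uint64_t align)
{
	uint64_t addr;

	assert(align != 0);
	addr = round_up(lo, align);
	return (addr < hi && hi - addr >= size) ? addr : 0;
}

/*
 * Return the first address where the image fits without touching any
 * 1GB-aligned window that is fully covered by usable RAM.
 */
static uint64_t pick_fragmented_slot(const struct mem_region *r, int n,
				     uint64_t size, uint64_t align)
{
	for (int i = 0; i < n; i++) {
		uint64_t first_full = round_up(r[i].start, GB);
		uint64_t last_full = round_down(r[i].end, GB);
		uint64_t addr;

		if (first_full >= last_full) {
			/* No complete 1GB window here: the whole region is usable. */
			addr = fit(r[i].start, r[i].end, size, align);
			if (addr)
				return addr;
			continue;
		}

		/* Partial head before the first complete 1GB window. */
		addr = fit(r[i].start, first_full, size, align);
		if (addr)
			return addr;

		/* Partial tail after the last complete 1GB window. */
		addr = fit(last_full, r[i].end, size, align);
		if (addr)
			return addr;
	}
	return 0;
}
```

A real version would pick randomly among all such slots rather than
returning the first one, otherwise we'd be trading away the
randomization itself.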