Re: [PATCH][RFC] mm: Introduce kernelcore=reliable option
From: Kamezawa Hiroyuki
Date: Fri Oct 09 2015 - 05:25:17 EST
On 2015/10/09 15:46, Xishi Qiu wrote:
On 2015/10/9 22:56, Taku Izumi wrote:
Xeon E7 v3 based systems support Address Range Mirroring,
and a UEFI BIOS compliant with UEFI spec 2.5 can report which
ranges are reliable (mirrored) via the EFI memory map.
The Linux kernel now uses this information and allocates
boot-time memory from the reliable region.
My requirement is:
- allocate kernel memory from reliable region
- allocate user memory from non-reliable region
In order to meet my requirement, ZONE_MOVABLE is useful.
By arranging the non-reliable ranges into ZONE_MOVABLE,
reliable memory is used only for kernel allocations.
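To make the arrangement above concrete, here is a minimal userspace sketch, not kernel code. Every name in it (sketch_range, sketch_zone_for_range, and so on) is hypothetical and exists only for illustration. It assumes the firmware memory map has already been reduced to a list of ranges, each carrying a "mirrored" flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are NOT the kernel's types. */
enum sketch_zone { SKETCH_ZONE_NORMAL, SKETCH_ZONE_MOVABLE };

struct sketch_range {
	uint64_t start;
	uint64_t end;      /* exclusive */
	bool mirrored;     /* assumed to come from the firmware memory map */
};

/* Mirrored ranges stay in the normal zone; all others go to the movable zone. */
static enum sketch_zone sketch_zone_for_range(const struct sketch_range *r)
{
	return r->mirrored ? SKETCH_ZONE_NORMAL : SKETCH_ZONE_MOVABLE;
}

/* Sum the bytes that end up in a given zone under the rule above. */
static uint64_t sketch_zone_bytes(const struct sketch_range *ranges, size_t n,
				  enum sketch_zone zone)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		if (sketch_zone_for_range(&ranges[i]) == zone)
			total += ranges[i].end - ranges[i].start;
	return total;
}
```

Since kernel allocations are never served from the movable zone, this split alone keeps kernel data on mirrored memory.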
Hi Taku,
You mean assigning non-mirrored memory to the movable zone and
mirrored memory to the normal zone, right? So kernel allocations
will use mirrored memory in the normal zone, and user allocations
will use non-mirrored memory in the movable zone.
My question is:
1) do we need to change the fallback function?
For *our* requirement, it's not required. But if someone wants to prevent
user memory allocations from ZONE_NORMAL, we need some changes in zonelist
walking.
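As a rough illustration of the fallback in question, the following userspace sketch models a two-zone walk. It is not the kernel's zonelist code. The "strict" parameter is a hypothetical knob standing in for whatever zonelist change would stop user allocations from falling back to the normal zone, and all identifiers are invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical two-zone model; the indexes mimic "higher zone first" walking. */
enum sketch_zone_idx { SKETCH_NORMAL = 0, SKETCH_MOVABLE = 1, SKETCH_NR_ZONES };

struct sketch_zone_state {
	size_t free_pages;
};

/*
 * Walk from the highest zone the request may use down to the lowest.
 * movable: the request may be placed in the movable zone (user memory).
 * strict:  hypothetical flag that forbids falling back to the normal zone.
 * Returns the index of the zone that served the request, or -1 on failure.
 */
static int sketch_alloc_page(struct sketch_zone_state zones[SKETCH_NR_ZONES],
			     bool movable, bool strict)
{
	int highest = movable ? SKETCH_MOVABLE : SKETCH_NORMAL;
	int lowest = (movable && strict) ? SKETCH_MOVABLE : SKETCH_NORMAL;

	for (int z = highest; z >= lowest; z--) {
		if (zones[z].free_pages > 0) {
			zones[z].free_pages--;
			return z;
		}
	}
	return -1;
}
```

With strict unset, a user allocation falls back to the normal (mirrored) zone once the movable zone is empty, which is the behaviour without any change. A kernel allocation never reaches the movable zone in either mode.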
2) the mirrored region should be located at the start of the normal
zone, right?
Precisely, "not-reliable" ranges of memory are handled by ZONE_MOVABLE.
This patch does only that.
I remember Kame has already suggested this idea. In my opinion,
I still think it's better to add a new migratetype or a new zone,
so both user and kernel could use mirrored memory.
Hi, Xishi.
Izumi-san and I discussed the implementation at length and found that using a "zone"
is the better approach.
The biggest reason is that a zone is the unit of vmscan, of all statistics, and of
handling a range of memory for a purpose. We can reuse all the vmscan and
statistics code by making use of zones. Introducing another structure would be messy.
His patch is very simple.
For your requirements, Izumi-san and I are discussing the following plan.
- Add a flag to show whether a zone is reliable or not, then mark ZONE_MOVABLE as not-reliable.
- Add __GFP_RELIABLE. This will allow alloc_pages() to skip not-reliable zones.
- Add madvise() MADV_RELIABLE and modify the page fault code's gfp flags according to that flag.
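The three steps could fit together roughly as in the userspace sketch below. This is only a model under stated assumptions: __GFP_RELIABLE and MADV_RELIABLE are proposals in this mail, and every identifier here (SKETCH_GFP_RELIABLE, sketch_vma, sketch_alloc, ...) is a hypothetical stand-in, not an existing kernel interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; they only model the proposed __GFP_RELIABLE idea. */
#define SKETCH_GFP_MOVABLE  0x1u
#define SKETCH_GFP_RELIABLE 0x2u

struct sketch_zone {
	const char *name;
	bool reliable;      /* step 1: per-zone "reliable" flag */
	bool movable_only;  /* zone serves only movable (user) requests */
	size_t free_pages;
};

/* Models a mapping that was marked with the proposed MADV_RELIABLE advice. */
struct sketch_vma {
	bool reliable;
};

/* Step 3: the fault path derives its gfp mask from the mapping's advice. */
static unsigned int sketch_fault_gfp(const struct sketch_vma *vma)
{
	unsigned int gfp = SKETCH_GFP_MOVABLE;

	if (vma->reliable)
		gfp |= SKETCH_GFP_RELIABLE;
	return gfp;
}

/* Step 2: walk zones in preference order, skipping not-reliable zones on request. */
static struct sketch_zone *sketch_alloc(struct sketch_zone *zones, size_t n,
					unsigned int gfp)
{
	for (size_t i = 0; i < n; i++) {
		struct sketch_zone *z = &zones[i];

		if (z->movable_only && !(gfp & SKETCH_GFP_MOVABLE))
			continue;   /* kernel requests never use the movable zone */
		if (!z->reliable && (gfp & SKETCH_GFP_RELIABLE))
			continue;   /* reliable requests skip not-reliable zones */
		if (z->free_pages == 0)
			continue;
		z->free_pages--;
		return z;
	}
	return NULL;
}
```

In this model an ordinary user fault is served from the not-reliable movable zone, while a fault in a mapping marked reliable skips it and lands in the mirrored normal zone.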
Thanks,
-Kame