[RFC 2/8] mm: Add specialized fallback zonelist for coherent device memory nodes

From: Anshuman Khandual
Date: Mon Oct 24 2016 - 00:32:39 EST


This change is part of the implementation of coherent device memory nodes
that require isolation.

A coherent memory node that seeks isolation must be shielded from implicit
memory allocations from user space, but there must still be an explicit way
to allocate from it. Kernel allocations from this memory can be prevented
by, for example, putting the entire memory in ZONE_MOVABLE.
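
As a rough illustration of why ZONE_MOVABLE keeps kernel allocations out,
the sketch below models zone selection from allocation flags. It is a
simplified userspace model, not kernel code: the MODEL_* names and both
helper functions are hypothetical, and it only assumes the well-known rule
that ZONE_MOVABLE is eligible when both the highmem and movable modifiers
are set (as in GFP_HIGHUSER_MOVABLE), which plain kernel allocations do not
carry.

```c
#include <stdbool.h>

/* Hypothetical zone indices, ordered from lowest to highest. */
enum model_zone { MODEL_ZONE_DMA, MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE };

/* Hypothetical stand-ins for the __GFP_HIGHMEM and __GFP_MOVABLE modifiers. */
#define MODEL_GFP_HIGHMEM 0x1u
#define MODEL_GFP_MOVABLE 0x2u

/* Highest zone an allocation with these flags may use (simplified). */
static enum model_zone model_gfp_zone(unsigned int flags)
{
	unsigned int both = MODEL_GFP_HIGHMEM | MODEL_GFP_MOVABLE;

	if ((flags & both) == both)
		return MODEL_ZONE_MOVABLE;
	return MODEL_ZONE_NORMAL;
}

/* A zone can serve the request only if it is not above that limit. */
static bool model_zone_allowed(enum model_zone zone, unsigned int flags)
{
	return zone <= model_gfp_zone(flags);
}
```

In this model a request without both modifiers (flags == 0 stands in for a
typical kernel allocation) is never allowed to use MODEL_ZONE_MOVABLE, so a
node whose memory sits entirely in the movable zone cannot serve it.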

Both zonelists of a platform node determine where memory comes from when
there is an allocation request. To achieve the two objectives stated above,
the zonelist building process has to change, because both zonelists
(FALLBACK and NOFALLBACK) give access to the node's memory zones during
any kind of memory allocation. The following changes are implemented in
this regard.

(1) Coherent node's zones are not part of any other node's FALLBACK list
(2) Coherent node's FALLBACK list contains its own memory zones followed
by all system RAM zones in normal order
(3) Coherent node's zones are part of its own NOFALLBACK list
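
The three rules can be sketched with a small userspace model of the
FALLBACK list construction. This is only an illustration under stated
assumptions: the real build_zonelists() orders candidates by distance
through find_next_best_node(), whereas this model visits the local node
first and then the others in ascending order. The isolated_cdm_node() body
here is a hypothetical stand-in that treats nodes 2 to 4 as coherent, and
build_fallback() is an invented helper name.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 5

/* Hypothetical stand-in: in this model nodes 2..4 are coherent device nodes. */
static bool isolated_cdm_node(int node)
{
	return node >= 2;
}

/*
 * Fill 'order' with the FALLBACK node order for node 'self' and return the
 * number of entries. The local node is visited first, then all others.
 */
static int build_fallback(int self, int order[MAX_NODES])
{
	int n = 0;

	for (int i = -1; i < MAX_NODES; i++) {
		int node = (i < 0) ? self : i;

		if (i == self)
			continue;	/* local node was already visited first */

		/* Same test as the patch: skip other nodes' coherent memory. */
		if (isolated_cdm_node(node) && node != self)
			continue;

		assert(n < MAX_NODES);
		order[n++] = node;
	}
	return n;
}
```

For a system RAM node such as node 0 the model yields the order 0, 1. For
a coherent node such as node 2 it yields 2, 0, 1. The NOFALLBACK list needs
no filter because it only ever contains the node's own zones.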

The above changes ensure the following, which in turn isolates the
coherent memory node as desired.

(1) There won't be any implicit allocation ending up in the coherent node
(2) __GFP_THISNODE marked allocations will come from the coherent node
(3) Coherent memory can also be allocated through MPOL_BIND interface
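
Guarantees (1) and (2) can be illustrated with a minimal model of how an
allocation walks a zonelist. It is a sketch, not the page allocator: the
fallback table is copied from the sample configuration in this posting,
and alloc_from(), NO_NODE and the free_pages array are hypothetical names.
The this_node argument models a __GFP_THISNODE request, which is served
from the NOFALLBACK list only. MPOL_BIND is deliberately left out of the
model.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES 5
#define NO_NODE  (-1)

/* FALLBACK node order per node, terminated by -1. */
static const int fallback[NR_NODES][NR_NODES + 1] = {
	{ 0, 1, -1 },		/* node 0: system RAM */
	{ 1, 0, -1 },		/* node 1: system RAM */
	{ 2, 0, 1, -1 },	/* node 2: coherent memory */
	{ 3, 0, 1, -1 },	/* node 3: coherent memory */
	{ 4, 0, 1, -1 },	/* node 4: coherent memory */
};

/*
 * Return the node that would serve a request issued for node 'nid', or
 * NO_NODE if no eligible node has free pages.
 */
static int alloc_from(int nid, bool this_node, const long free_pages[NR_NODES])
{
	assert(nid >= 0 && nid < NR_NODES);

	/* NOFALLBACK list: only the node itself is eligible. */
	if (this_node)
		return free_pages[nid] > 0 ? nid : NO_NODE;

	/* FALLBACK list: take the first listed node with free pages. */
	for (int i = 0; fallback[nid][i] >= 0; i++) {
		int node = fallback[nid][i];

		if (free_pages[node] > 0)
			return node;
	}
	return NO_NODE;
}
```

With both system RAM nodes exhausted and the coherent nodes full of free
pages, an implicit request on node 0 fails in this model instead of
spilling into coherent memory, while a this_node request for node 2 is
served from node 2.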

Sample zonelist configuration:

[NODE (0)] System RAM node
    ZONELIST_FALLBACK (0xc00000000140da00)
        (0) (node 0) (DMA 0xc00000000140c000)
        (1) (node 1) (DMA 0xc000000100000000)
    ZONELIST_NOFALLBACK (0xc000000001411a10)
        (0) (node 0) (DMA 0xc00000000140c000)
[NODE (1)] System RAM node
    ZONELIST_FALLBACK (0xc000000100001a00)
        (0) (node 1) (DMA 0xc000000100000000)
        (1) (node 0) (DMA 0xc00000000140c000)
    ZONELIST_NOFALLBACK (0xc000000100005a10)
        (0) (node 1) (DMA 0xc000000100000000)
[NODE (2)] Coherent memory
    ZONELIST_FALLBACK (0xc000000001427700)
        (0) (node 2) (Movable 0xc000000001427080)
        (1) (node 0) (DMA 0xc00000000140c000)
        (2) (node 1) (DMA 0xc000000100000000)
    ZONELIST_NOFALLBACK (0xc00000000142b710)
        (0) (node 2) (Movable 0xc000000001427080)
[NODE (3)] Coherent memory
    ZONELIST_FALLBACK (0xc000000001431400)
        (0) (node 3) (Movable 0xc000000001430d80)
        (1) (node 0) (DMA 0xc00000000140c000)
        (2) (node 1) (DMA 0xc000000100000000)
    ZONELIST_NOFALLBACK (0xc000000001435410)
        (0) (node 3) (Movable 0xc000000001430d80)
[NODE (4)] Coherent memory
    ZONELIST_FALLBACK (0xc00000000143b100)
        (0) (node 4) (Movable 0xc00000000143aa80)
        (1) (node 0) (DMA 0xc00000000140c000)
        (2) (node 1) (DMA 0xc000000100000000)
    ZONELIST_NOFALLBACK (0xc00000000143f110)
        (0) (node 4) (Movable 0xc00000000143aa80)

Signed-off-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>
---
mm/page_alloc.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2b3bf67..a2536b4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4753,6 +4753,16 @@ static void build_zonelists(pg_data_t *pgdat)
i = 0;

while ((node = find_next_best_node(local_node, &used_mask)) >= 0) {
+#ifdef CONFIG_COHERENT_DEVICE
+ /*
+ * Zones of a coherent device memory node that requires
+ * isolation should not be part of any other node's fallback
+ * zonelist, only of its own fallback list.
+ */
+ if (isolated_cdm_node(node) && (pgdat->node_id != node))
+ continue;
+#endif
+
/*
* We don't want to pressure a particular node.
* So adding penalty to the first node in same
--
2.1.0