Re: [RFC 2/8] mm: Add specialized fallback zonelist for coherent device memory nodes

From: Anshuman Khandual
Date: Thu Nov 17 2016 - 02:41:54 EST


On 10/24/2016 10:01 AM, Anshuman Khandual wrote:
> This change is part of the implementation of coherent device memory
> nodes that require isolation.
>
> A coherent memory node that seeks isolation must be shielded from implicit
> memory allocations from user space, but at the same time there should also
> be an explicit way to allocate from it. Kernel allocations to this memory
> can be prevented by, for example, putting the entire memory in ZONE_MOVABLE.
>
> A platform node's two zonelists determine where memory comes from
> when there is an allocation request. To achieve the two objectives
> stated above, the zonelist building process has to change, because both
> zonelists (FALLBACK and NOFALLBACK) give access to the node's memory zones
> during any kind of memory allocation. The following changes are implemented
> in this regard.
>
> (1) Coherent node's zones are not part of any other node's FALLBACK list
> (2) Coherent node's FALLBACK list contains its own memory zones followed
> by all system RAM zones in normal order
> (3) Coherent node's zones are part of its own NOFALLBACK list
>
> The above changes ensure the following, which in turn isolates
> the coherent memory node as desired.
>
> (1) There won't be any implicit allocation ending up in the coherent node
> (2) __GFP_THISNODE marked allocations will come from the coherent node
> (3) Coherent memory can also be allocated through MPOL_BIND interface
>
> Sample zonelist configuration:
>
> [NODE (0)] System RAM node
>     ZONELIST_FALLBACK (0xc00000000140da00)
>         (0) (node 0) (DMA 0xc00000000140c000)
>         (1) (node 1) (DMA 0xc000000100000000)
>     ZONELIST_NOFALLBACK (0xc000000001411a10)
>         (0) (node 0) (DMA 0xc00000000140c000)
> [NODE (1)] System RAM node
>     ZONELIST_FALLBACK (0xc000000100001a00)
>         (0) (node 1) (DMA 0xc000000100000000)
>         (1) (node 0) (DMA 0xc00000000140c000)
>     ZONELIST_NOFALLBACK (0xc000000100005a10)
>         (0) (node 1) (DMA 0xc000000100000000)
> [NODE (2)] Coherent memory
>     ZONELIST_FALLBACK (0xc000000001427700)
>         (0) (node 2) (Movable 0xc000000001427080)
>         (1) (node 0) (DMA 0xc00000000140c000)
>         (2) (node 1) (DMA 0xc000000100000000)
>     ZONELIST_NOFALLBACK (0xc00000000142b710)
>         (0) (node 2) (Movable 0xc000000001427080)
> [NODE (3)] Coherent memory
>     ZONELIST_FALLBACK (0xc000000001431400)
>         (0) (node 3) (Movable 0xc000000001430d80)
>         (1) (node 0) (DMA 0xc00000000140c000)
>         (2) (node 1) (DMA 0xc000000100000000)
>     ZONELIST_NOFALLBACK (0xc000000001435410)
>         (0) (node 3) (Movable 0xc000000001430d80)
> [NODE (4)] Coherent memory
>     ZONELIST_FALLBACK (0xc00000000143b100)
>         (0) (node 4) (Movable 0xc00000000143aa80)
>         (1) (node 0) (DMA 0xc00000000140c000)
>         (2) (node 1) (DMA 0xc000000100000000)
>     ZONELIST_NOFALLBACK (0xc00000000143f110)
>         (0) (node 4) (Movable 0xc00000000143aa80)
>
> Signed-off-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>
> ---

Another way of achieving isolation of the CDM nodes from user space
allocations would be through cpuset changes. I will be sending out a
couple of draft patches in this direction. Then we can look into
whether the current method or the cpuset method is the better way
forward.
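
For reference, the three zonelist rules in the quoted changelog can be
modelled in user space roughly as below. This is only a minimal sketch
and not the patch itself: struct node_model, build_fallback() and
build_nofallback() are hypothetical names, a node is reduced to a single
"coherent" flag, zones are collapsed into node ids, and the order of the
remaining system RAM nodes is assumed to be a plain round-robin walk
rather than the distance-based ordering the kernel uses.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8

/* Hypothetical model of a node: only what the three rules need. */
struct node_model {
	bool coherent;	/* true for a coherent device memory (CDM) node */
};

/*
 * Fill 'out' with the node ids whose zones would appear in node 'nid's
 * FALLBACK list, in order, and return how many entries were written.
 * 'out' must have room for at least nr_nodes entries.
 */
static int build_fallback(const struct node_model *nodes, int nr_nodes,
			  int nid, int *out)
{
	int n = 0;

	assert(nr_nodes <= MAX_NODES && nid >= 0 && nid < nr_nodes);

	/* The local node always comes first. */
	out[n++] = nid;

	if (nodes[nid].coherent) {
		/* Rule (2): own zones, then all system RAM in normal order. */
		for (int i = 0; i < nr_nodes; i++)
			if (!nodes[i].coherent)
				out[n++] = i;
		return n;
	}

	/*
	 * Rule (1): a system RAM node falls back only to other system RAM
	 * nodes; coherent nodes are skipped. Round-robin order is an
	 * assumption of this sketch.
	 */
	for (int i = 1; i < nr_nodes; i++) {
		int other = (nid + i) % nr_nodes;

		if (!nodes[other].coherent)
			out[n++] = other;
	}
	return n;
}

/*
 * Rule (3): the NOFALLBACK list holds only the node's own zones, so a
 * __GFP_THISNODE allocation on a coherent node stays on that node.
 */
static int build_nofallback(int nid, int *out)
{
	out[0] = nid;
	return 1;
}
```

With nodes 0-1 marked as system RAM and nodes 2-4 marked as coherent,
build_fallback() gives {0, 1} for node 0, {1, 0} for node 1 and
{2, 0, 1} for node 2, which matches the node order in the sample
configuration quoted above.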