Re: [PATCH v2] ARC: fix memory nodes topology in case of highmem enabled

From: Eugeniy Paltsev
Date: Mon May 13 2019 - 09:14:55 EST


Hi Vineet,

ping.

On Wed, 2019-04-17 at 13:46 +0300, Eugeniy Paltsev wrote:
> Tweak generic node topology in case of CONFIG_HIGHMEM enabled to
> prioritize allocations from ZONE_HIGHMEM to avoid ZONE_NORMAL
> pressure.
>
> Here is example when we can see problems on ARC with currently
> existing topology configuration:
>
> Generic statements:
> - *NOT* every memory allocation which could be done from
> ZONE_NORMAL also could be done from ZONE_HIGHMEM.
> - Every memory allocation which could be done from ZONE_HIGHMEM
> also could be done from ZONE_NORMAL (In other words ZONE_NORMAL
> is more universal than ZONE_HIGHMEM)
>
> ARC statements:
> In case of CONFIG_HIGHMEM enabled we have 2 memory nodes:
> - "node 0" has only ZONE_NORMAL memory.
> - "node 1" has only ZONE_HIGHMEM memory.
>
> Steps to reproduce the problem:
> 1) Let's try to allocate some memory from userspace. It can be
> allocate from anywhere (ZONE_HIGHMEM/ZONE_NORMAL).
> 2) Kernel tries to allocate memory from the closest memory node
> to this CPU. As we don't have NUMA enabled and don't override
> any define from "include/asm-generic/topology.h" the closest
> memory node to any CPU will be "node 0"
> 3) OK, we'll allocate memory from "node 0". Let's choose ZONE
> to allocate from. This allocation could be done from both
> ZONE_HIGHMEM / ZONE_NORMAL in this node. The allocation
> priority between zones is ZONE_HIGHMEM > ZONE_NORMAL.
> This is pretty logical - we don't want waste *universal*
> ZONE_NORMAL if we can use ZONE_HIGHMEM. But we don't have
> ZONE_HIGHMEM in "node 0" that's why we rollback to
> ZONE_NORMAL and allocate memory from it.
> 4) Let's try to allocate a lot of memory [more than we have free
> memory in lowmem] from userspace.
> 5) Kernel allocates as much memory as it can from the closest
> memory node ("node 0"). But there is no enough memory in
> "node 0". So we'll rollback to another memory node ("node 1")
> and allocate the rest of the amount from it.
>
> In other words we have following memory lookup path:
> (node 0, ZONE_HIGHMEM) ->
> -> (node 0, ZONE_NORMAL) ->
> -> (node 1, ZONE_HIGHMEM)
>
> Now we don't have any free memory in (node 0, ZONE_NORMAL)
> [Actually this is a simplification, but it doesn't matter
> in this example]
> 6) Oops, some internal kernel memory allocation happen which
> requires ZONE_NORMAL. For example "kmalloc(size, GFP_KERNEL)"
> was called.
> So the we have following memory lookup path:
> (node 0, ZONE_NORMAL) -> ("node 1", ZONE_NORMAL)
> There is no free memory in "node 0". And there is no
> ZONE_NORMAL in "node 1". We only have some free memory in
> (node 1, ZONE_HIGHMEM) but HIGHMEM isn't suitable in this
> case.
> 7) As we can't allocate memory OOM-Killer is invoked, even if
> we have some free memory in (node 1, ZONE_HIGHMEM).
>
> This patch tweaks generic node topology and mark memory from
> "node 1" as the closest to any CPU.
>
> So the we'll have following memory lookup path:
> (node 1, ZONE_HIGHMEM) ->
> -> (node 1, ZONE_NORMAL) ->
> -> (node 0, ZONE_HIGHMEM) ->
> -> (node 0, ZONE_NORMAL)
> In case of node configuration on ARC we obtain the degenerate case
> of this path:
> (node 1, ZONE_HIGHMEM) -> (node 0, ZONE_NORMAL)
>
> In this case we don't waste *universal* ZONE_NORMAL if we can use
> ZONE_HIGHMEM so we don't face with the issue pointed in [5-7]
>
> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@xxxxxxxxxxxx>
> ---
> Changes v1->v2:
> * Changes in commit message and comments in a code. No functional
> change intended.
>
> arch/arc/include/asm/Kbuild | 1 -
> arch/arc/include/asm/topology.h | 24 ++++++++++++++++++++++++
> 2 files changed, 24 insertions(+), 1 deletion(-)
> create mode 100644 arch/arc/include/asm/topology.h
>
> diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild
> index caa270261521..e64e0439baff 100644
> --- a/arch/arc/include/asm/Kbuild
> +++ b/arch/arc/include/asm/Kbuild
> @@ -18,7 +18,6 @@ generic-y += msi.h
> generic-y += parport.h
> generic-y += percpu.h
> generic-y += preempt.h
> -generic-y += topology.h
> generic-y += trace_clock.h
> generic-y += user.h
> generic-y += vga.h
> diff --git a/arch/arc/include/asm/topology.h b/arch/arc/include/asm/topology.h
> new file mode 100644
> index 000000000000..c3b8ab7ed011
> --- /dev/null
> +++ b/arch/arc/include/asm/topology.h
> @@ -0,0 +1,24 @@
> +#ifndef _ASM_ARC_TOPOLOGY_H
> +#define _ASM_ARC_TOPOLOGY_H
> +
> +/*
> + * On ARC (w/o PAE) HIGHMEM addresses are smaller (0x0 based) than addresses in
> + * NORMAL aka low memory (0x8000_0000 based).
> + * Thus HIGHMEM on ARC is implemented with DISCONTIGMEM which requires multiple
> + * nodes. So here is memory node map on ARC:
> + * - node 0: ZONE_NORMAL memory (always)
> + * - node 1: ZONE_HIGHMEM memory (only if CONFIG_HIGHMEM is enabled)
> + *
> + * In case of CONFIG_HIGHMEM enabled we tweak generic node topology and mark
> + * node 1 as the closest to all CPUs to prioritize allocations from ZONE_HIGHMEM
> + * where it is possible to avoid ZONE_NORMAL pressure.
> + */
> +#ifdef CONFIG_HIGHMEM
> +#define cpu_to_node(cpu) ((void)(cpu), 1)
> +#define cpu_to_mem(cpu) ((void)(cpu), 1)
> +#define cpumask_of_node(node) ((node) == 1 ? cpu_online_mask : cpu_none_mask)
> +#endif /* CONFIG_HIGHMEM */
> +
> +#include <asm-generic/topology.h>
> +
> +#endif /* _ASM_ARC_TOPOLOGY_H */
--
Eugeniy Paltsev