[PATCH 0/3] mm: memcontrol: recursive memory protection

From: Johannes Weiner
Date: Fri Dec 13 2019 - 14:22:14 EST


The current memory.low (and memory.min) semantics require protection
to be assigned to a cgroup in an uninterrupted chain from the
top-level cgroup all the way to the leaf.

In practice, we want to protect entire cgroup subtrees from each other
(system management software vs. workload), but we would like the VM to
balance memory optimally *within* each subtree, without having to make
explicit weight allocations among individual components. The current
semantics make that impossible.

This patch series extends memory.low/min such that the knobs apply
recursively to the entire subtree. Users can still assign explicit
protection to subgroups, but if they don't, the protection set by the
parent cgroup will be distributed dynamically such that children
compete freely - as if no memory control were enabled inside the
subtree - but enjoy protection from neighboring trees.
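
To make the intended behavior concrete, here is a rough sketch of one
way a parent's protection could be shared among its children. It is a
simplified illustrative model, not the code from patch #3: the function
name effective_protection(), the (explicit_low, usage) tuples, and the
rule of splitting unclaimed protection in proportion to unprotected
usage are all assumptions made for this example.

```python
def effective_protection(parent_low, children):
    """Hypothetical model of recursive protection distribution.

    children maps a child name to (explicit_low, usage), all values in
    the same unit (e.g. MiB). Returns each child's effective protection.
    """
    # A child can only claim protection for memory it actually uses.
    claimed = {name: min(low, usage)
               for name, (low, usage) in children.items()}
    total_claimed = sum(claimed.values())

    # Overcommitted parent: scale explicit claims down proportionally.
    if total_claimed > parent_low and total_claimed > 0:
        scale = parent_low / total_claimed
        return {name: c * scale for name, c in claimed.items()}

    # Otherwise, hand out the parent's unclaimed protection in
    # proportion to each child's unprotected usage (assumption).
    unclaimed = parent_low - total_claimed
    unprotected = {name: children[name][1] - claimed[name]
                   for name in children}
    total_unprotected = sum(unprotected.values())

    result = {}
    for name, (_, usage) in children.items():
        share = 0.0
        if total_unprotected > 0:
            share = unclaimed * unprotected[name] / total_unprotected
        # Never protect more than the child is using.
        result[name] = min(usage, claimed[name] + share)
    return result
```

With a parent protection of 100 and two children that set no explicit
protection and use 300 and 100 respectively, this model gives them 75
and 25: the subtree as a whole is shielded from its neighbors, while
the split inside it simply follows usage.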

Patch #1 fixes an existing bug that can give a cgroup tree more
protection than it should receive as per ancestor configuration.

Patch #2 simplifies and documents the existing code to make it easier
to reason about the changes in the next patch.

Patch #3 finally implements recursive memory protection semantics.

Because of a risk of regressing legacy setups, the new semantics are
hidden behind a cgroup2 mount option, 'memory_recursiveprot'.

More details in patch #3.

 Documentation/admin-guide/cgroup-v2.rst |  11 ++
 include/linux/cgroup-defs.h             |   5 +
 kernel/cgroup/cgroup.c                  |  17 ++-
 mm/memcontrol.c                         | 241 +++++++++++++++++++-----------
 mm/page_counter.c                       |  12 +-
 5 files changed, 190 insertions(+), 96 deletions(-)