[RFC PATCH v5 1/3] memcg: Add memory.max.effective attribute

From: Michal Koutný
Date: Thu Jun 06 2024 - 11:24:30 EST


Some applications use memory cgroup limits to scale their own memory
needs. Reading memory.max of the cgroup the application belongs to is
not sufficient because ancestor cgroups may impose tighter limits. The
application could traverse upwards to find the tightest limit, but this
does not work in a cgroup namespace, where the view of the cgroup
hierarchy is incomplete and the limit may be imposed from outside the
namespace.

(cgroup v1 reported this value as memory.stat:hierarchical_memory_limit
but cgroup v2 memory.stat has no counterpart.)

Introduce a new memcg attribute file that contains the effective value
of the memory limit for the given cgroup (following the
cpuset.cpus.effective pattern).

Signed-off-by: Jan Kratochvil (Azul) <jkratochvil@xxxxxxxx>
[ mkoutny: rewrite commit message, split out memory.swap.max]
Signed-off-by: Michal Koutný <mkoutny@xxxxxxxx>
---
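For illustration only (not part of the patch): a minimal userspace
sketch of the upward traversal described in the commit message, assuming
the application has already read the text of memory.max for its own
cgroup and for each ancestor it can see. The names parse_limit(),
effective_limit() and LIMIT_UNLIMITED are hypothetical and exist only in
this sketch. Inside a cgroup namespace the list of visible ancestors is
incomplete, so the result can be higher than the limit that actually
applies; that is the gap memory.max.effective is meant to close.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinel for "max" (no limit); not a kernel constant. */
#define LIMIT_UNLIMITED ULLONG_MAX

/* Parse the text of a memory.max-style file: "max" or a byte count. */
static unsigned long long parse_limit(const char *text)
{
	if (strncmp(text, "max", 3) == 0)
		return LIMIT_UNLIMITED;
	return strtoull(text, NULL, 10);
}

/*
 * Return the tightest limit among the given memory.max contents, one
 * entry for the cgroup itself and one for each visible ancestor.
 */
static unsigned long long effective_limit(const char *const *limits, size_t n)
{
	unsigned long long eff = LIMIT_UNLIMITED;
	size_t i;

	assert(limits != NULL || n == 0);

	for (i = 0; i < n; i++) {
		unsigned long long cur = parse_limit(limits[i]);

		if (cur < eff)
			eff = cur;
	}
	return eff;
}
```

With the new file, an application would read the single value from
memory.max.effective instead of collecting these entries itself.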
Documentation/admin-guide/cgroup-v2.rst | 6 ++++++
mm/memcontrol.c | 18 ++++++++++++++++++
2 files changed, 24 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 8fbb0519d556..988f26264054 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1293,6 +1293,12 @@ PAGE_SIZE multiple when read back.
Caller could retry them differently, return into userspace
as -ENOMEM or silently ignore in cases like disk readahead.

+ memory.max.effective
+ A read-only file that provides the effective value of the cgroup's
+ hard usage limit. It incorporates the limits of all ancestors, even
+ those not visible in the cgroup namespace. A change of the value in
+ this file generates a file modified event.
+
memory.reclaim
A write-only nested-keyed file which exists for all cgroups.

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7fad15b2290c..86bcec84fe7b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7065,6 +7065,19 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
 	return nbytes;
 }
 
+static int memory_max_effective_show(struct seq_file *m, void *v)
+{
+	unsigned long memory;
+	struct mem_cgroup *mi;
+
+	/* Hierarchical information */
+	memory = PAGE_COUNTER_MAX;
+	for (mi = mem_cgroup_from_seq(m); mi; mi = parent_mem_cgroup(mi))
+		memory = min(memory, READ_ONCE(mi->memory.max));
+
+	return seq_puts_memcg_tunable(m, memory);
+}
+
 /*
  * Note: don't forget to update the 'samples/cgroup/memcg_event_listener'
  * if any new events become available.
@@ -7259,6 +7272,11 @@ static struct cftype memory_files[] = {
 		.seq_show = memory_max_show,
 		.write = memory_max_write,
 	},
+	{
+		.name = "max.effective",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = memory_max_effective_show,
+	},
 	{
 		.name = "events",
 		.flags = CFTYPE_NOT_ON_ROOT,
--
2.45.1