Re: [PATCH 1/7] mm: memcontrol: charge swap to cgroup2

From: Kamezawa Hiroyuki
Date: Tue Dec 15 2015 - 22:58:35 EST


On 2015/12/16 2:21, Michal Hocko wrote:
> I completely agree that malicious/untrusted users absolutely have to
> be capped by the hard limit. Then the separate swap limit would work
> for sure. But I am less convinced about usefulness of the rigid (to
> the global memory pressure) swap limit without the hard limit. All the
> memory that could have been swapped out will make a memory pressure to
> the rest of the system without being punished for it too much. Memcg
> is allowed to grow over the high limit (in the current implementation)
> without any way to shrink back in other words.
>
> My understanding was that the primary use case for the swap limit is to
> handle potential (not only malicious but also unexpectedly misbehaving
> application) anon memory consumption runaways more gracefully without
> the massive disruption on the global level. I simply didn't see swap
> space partitioning as important enough because an alternative to swap
> usage is to consume primary memory which is a more precious resource
> IMO. Swap storage is really cheap and runtime expandable resource which
> is not the case for the primary memory in general. Maybe there are other
> use cases I am not aware of, though. Do you want to guarantee the swap
> availability?


At the time of the first implementation, an engineer from NEC explained their use case in the
HPC area. At that time, there was no swap support.

Consider two workloads partitioned into groups A and B, with 100GB of total swap.
A: memory.limit = 40G
B: memory.limit = 40G

The job scheduler runs applications in A and B in turn. Apps in A are stopped while apps in B are running.

If App-A requires 120GB of anonymous memory, 40GB fits under its memory limit and the remaining
80GB goes to swap. So App-B can use only 20GB of swap. This causes trouble if App-B needs 100GB
of anonymous memory, because it would need 60GB of swap.
They need some knob to control the amount of swap per cgroup.
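
To make the arithmetic above concrete, here is a minimal sketch of the scenario as a toy
model. It is not kernel code: the function names and the swap_cap_gb parameter are
hypothetical and only stand in for "some knob to control the amount of swap per cgroup".
It assumes a stopped job keeps whatever swap it already holds.

```python
# Toy model of the scenario above; all sizes are in GB.
# Function and parameter names are hypothetical, not kernel interfaces.

def swap_needed(anon_gb, mem_limit_gb):
    """Anonymous memory that does not fit under the memory limit goes to swap."""
    return max(0, anon_gb - mem_limit_gb)


def run_in_turn(total_swap_gb, jobs, swap_cap_gb=None):
    """Run jobs one after another; a stopped job keeps its swap.

    jobs: list of (name, anon_gb, mem_limit_gb) in scheduling order.
    swap_cap_gb: optional per-group swap cap (the hypothetical knob).
    Returns {name: (swap_wanted, swap_granted)}.
    """
    free_swap = total_swap_gb
    result = {}
    for name, anon_gb, mem_limit_gb in jobs:
        wanted = swap_needed(anon_gb, mem_limit_gb)
        # A per-group cap bounds how much swap this group may take.
        allowed = wanted if swap_cap_gb is None else min(wanted, swap_cap_gb)
        # A group can never get more swap than is still free system-wide.
        granted = min(allowed, free_swap)
        free_swap -= granted
        result[name] = (wanted, granted)
    return result


jobs = [("A", 120, 40), ("B", 100, 40)]

# No per-group cap: A takes 80GB of swap, B wants 60GB but only 20GB is left.
uncapped = run_in_turn(100, jobs)

# With a 50GB per-group cap, A is held to 50GB and B still finds 50GB free.
capped = run_in_turn(100, jobs, swap_cap_gb=50)

print(uncapped)
print(capped)
```

In the capped run neither job gets all the swap it wants, but A's overrun is contained in
group A instead of shrinking what is left for B.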

The point is that, at least for their customers, swap is a "resource" that should be under
control. In their use case, memory usage and swap usage have the same meaning, so a
mem+swap limit doesn't cause trouble.

Thanks,
-Kame



