Re: [RFC PATCH] mm: memcontrol: memory+swap accounting for cgroup-v2

From: Michal Hocko
Date: Tue Dec 19 2017 - 07:49:46 EST


On Mon 18-12-17 16:01:31, Shakeel Butt wrote:
> The memory controller in cgroup v1 provides the memory+swap (memsw)
> interface to account for the combined memory and swap usage of the
> jobs. The memsw interface allows users to limit or view the
> consistent memory usage of their jobs irrespective of the presence of
> swap on the system (consistent OOM and memory reclaim behavior). The
> memory+swap accounting makes the job easier for centralized systems
> doing resource usage monitoring, prediction or anomaly detection.
>
> In cgroup v2, the 'memsw' interface was dropped and a new 'swap'
> interface has been introduced which allows limiting the actual usage of
> swap by the job. For systems where swap is a limited resource, the
> 'swap' interface can be used to fairly distribute the swap resource
> between different jobs. There is no easy way to limit the swap usage
> using the 'memsw' interface.
>
> However, for systems where swap is cheap and can be increased
> dynamically (like remote swap and swap on zram), the 'memsw' interface
> is much more appropriate as it makes swap transparent to the jobs and
> gives consistent memory usage history to centralized monitoring systems.
>
> This patch adds the memsw interface to the cgroup v2 memory controller
> behind a mount option 'memsw'. The memsw interface is mutually exclusive
> with the existing swap interface. When 'memsw' is enabled, reading or
> writing the 'swap' interface files will return -ENOTSUPP and vice versa.
> Enabling or disabling memsw by remounting cgroup v2 will only be
> effective if there are no descendants of the root cgroup.
>
> When memsw accounting is enabled, "memory.high" is compared with
> memory+swap usage. So, when the allocating job's memsw usage hits its
> high mark, the job will be throttled by triggering memory reclaim.

From a quick look, this looks like a mess. We have agreed to go with
the current scheme for some good reasons. There are pros and cons to both
approaches but I am not convinced we should complicate the user API for
the use case you describe.

> Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
--
Michal Hocko
SUSE Labs