Re: [PATCH] memcg: add interface to force disable swap

From: Jianlin Lv
Date: Sun Oct 08 2023 - 03:52:33 EST


On Sun, Oct 8, 2023 at 9:17 AM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>
> Jianlin Lv <iecedge@xxxxxxxxx> writes:
>
> > From: Jianlin Lv <iecedge@xxxxxxxxx>
> >
> > Global reclaim will swap even if swappiness is set to 0.
>
> Why? Can you elaborate the situation?

In our production environment, we reproduced pages being swapped out even
with swappiness set to 0, using the test program below. I am not sure
whether this program reproduces the issue in every environment.

From the implementation of get_scan_count(), it can be seen that memory
reclaim chooses a scan balance (SCAN_ANON/SCAN_FILE/SCAN_FRACT) based on
the current runtime state, and that choice determines how aggressively the
anon and file LRUs are scanned. However, this introduces uncertainty.
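
To make the uncertainty concrete, here is a rough Python model of the order
of checks as I read them in get_scan_count() in mm/vmscan.c. It is only a
sketch: the ScanControl class and pick_scan_balance() are illustrative names,
not kernel code, the fields loosely mirror struct scan_control, and the exact
checks differ between kernel versions.

```python
from dataclasses import dataclass


@dataclass
class ScanControl:
    """Illustrative stand-in for the kernel's struct scan_control."""
    may_swap: bool = True
    can_reclaim_anon: bool = True   # swap space is available
    cgroup_reclaim: bool = False    # False models global reclaim
    priority: int = 12              # 0 models reclaim close to OOM
    file_is_tiny: bool = False      # almost no file pages left
    cache_trim_mode: bool = False   # plenty of inactive page cache


def pick_scan_balance(sc, swappiness):
    """Return the scan balance this simplified model would choose."""
    if not sc.may_swap or not sc.can_reclaim_anon:
        return "SCAN_FILE"
    # Only cgroup (memcg) reclaim treats swappiness == 0 as "never swap".
    if sc.cgroup_reclaim and swappiness == 0:
        return "SCAN_FILE"
    if sc.priority == 0 and swappiness:
        return "SCAN_EQUAL"
    # Global reclaim can still force anon scanning here with swappiness 0.
    if sc.file_is_tiny:
        return "SCAN_ANON"
    if sc.cache_trim_mode:
        return "SCAN_FILE"
    return "SCAN_FRACT"
```

In this model, global reclaim with swappiness 0 and file_is_tiny set ends up
in SCAN_ANON, while the same state under cgroup reclaim stays in SCAN_FILE.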

For the JVM issue at hand, we want a deterministic SCAN_FILE scan so that
anon pages are never swapped out.


code::
#!/usr/bin/env python

import mmap
import os
import sys

def write_files():
    # Generate page cache pressure by writing 100 MB files in an endless
    # loop, cycling through 6000 file names.
    count = 1
    if not os.path.isdir(WRITE_DIR):
        os.mkdir(WRITE_DIR)

    while True:
        _, i = divmod(count, 6000)
        file = "{}/{}_{}.txt".format(WRITE_DIR, WRITE_FILE, i)

        with open(file, 'w') as f:
            # Write 100 MB to a file
            num_chars = 100 * 1024 * 1024
            f.write('0' * num_chars)
        count = count + 1

def create_read_file():
    # Create the roughly 10 GB file that read_file() loads into memory.
    with open(READ_FILE, 'wb') as f:
        num_chars = 10000 * 1024 * 1024
        f.write(b'0' * num_chars)

def read_file():
    with open(READ_FILE, mode="r", encoding="utf8") as f:
        mm = mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ)
        # mm.read() copies the whole file into anonymous memory.
        text = mm.read()
        # Keep that copy alive while the writer loop runs.
        write_files()

WRITE_FILE = "file"
WRITE_DIR = "/tmp/rm_rf_me"
READ_FILE = "/tmp/10g_file_delete"
if not os.path.isfile(READ_FILE):
    create_read_file()
read_file()
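
One way to check whether the test process actually had anon pages swapped out
is to watch the VmSwap field in /proc/<pid>/status while it runs. The helper
below is a small sketch for that; vm_swap_kb() and read_vm_swap_kb() are
hypothetical names and are not part of the test program above.

```python
import re


def vm_swap_kb(status_text):
    """Return the VmSwap value in kB from /proc/<pid>/status text, or None."""
    m = re.search(r"^VmSwap:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(m.group(1)) if m else None


def read_vm_swap_kb(pid="self"):
    """Read and parse VmSwap for the given pid (Linux only)."""
    with open("/proc/{}/status".format(pid)) as f:
        return vm_swap_kb(f.read())
```

Calling read_vm_swap_kb() with the pid of the test process from another shell
and seeing a non-zero value would indicate that its anon pages were swapped.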


Jianlin

>
> > In particular
> > case, users wish to be able to completely disable swap for specific
> > processes. One scenario is that if JVM memory pages falls into swap,
> > the performance will noticeably reduce and the GC pauses tend to increase
> > to levels not tolerable by most applications.
> > If it's possible to only disable swap out for specific processes, it can
> > address the JVM GC pauses issues, and at the same time, memory reclaim
> > pressure is also manageable.
> >
> > This patch adds "memory.swap_force_disable" control file to support disable
> > swap for non-root cgroup. When process is associated with a cgroup,
> > 'echo 1 > memory.swap_force_disable' will forbid anon pages be swapped out.
> > This patch also adds read and write handler of the control file.
>
> --
> Best Regards,
> Huang, Ying