Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines

From: Bruce Merry
Date: Tue Jul 24 2018 - 06:05:40 EST


On 18 July 2018 at 19:40, Bruce Merry <bmerry@xxxxxxxxx> wrote:
>> Yes, very easy to produce zombies, though I don't think kernel
>> provides any way to tell how many zombies exist on the system.
>>
>> To create a zombie, first create a memcg node, enter that memcg,
>> create a tmpfs file of few KiBs, exit the memcg and rmdir the memcg.
>> That memcg will be a zombie until you delete that tmpfs file.
>
> Thanks, that makes sense. I'll see if I can reproduce the issue.

Hi

I've had some time to experiment with this issue, and I've now got a
way to reproduce it fairly reliably, including with a stock 4.17.8
kernel. However, it's very phase-of-the-moon stuff, and even
apparently trivial changes (like switching the order in which the
files are statted) make the issue disappear.

To reproduce:
1. Start cadvisor running. I use the 0.30.2 binary from GitHub, and
   run it with: sudo ./cadvisor-0.30.2 --logtostderr=true
2. Run the Python 3 script below, which repeatedly creates a cgroup,
   enters it, stats some files in it, and leaves it again (and removes
   it). It takes a few minutes to run.
3. time cat /sys/fs/cgroup/memory/memory.stat. It now takes about 20ms for me.
4. sudo sysctl vm.drop_caches=2
5. time cat /sys/fs/cgroup/memory/memory.stat. It is back to 1-2ms.
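
For anyone who would rather take the measurement in steps 3 and 5 from
Python than from the shell, a minimal sketch follows. The helper name
time_read and the choice to keep the fastest of a few repeats are my
own assumptions, not part of the original procedure, and the numbers
will not match "time cat" exactly because no process startup is
included.

```python
import os
import time

# Path from steps 3 and 5 (cgroup v1 memory controller root).
STAT_PATH = '/sys/fs/cgroup/memory/memory.stat'


def time_read(path, repeats=5):
    """Return the fastest wall-clock time (in seconds) to read path fully.

    Hypothetical helper: taking the minimum of several repeats is just
    one way to reduce noise.
    """
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, 'rb') as f:
            f.read()
        best = min(best, time.perf_counter() - start)
    return best


# Only touch the real file if this machine has a cgroup v1 memory hierarchy.
if __name__ == '__main__' and os.path.exists(STAT_PATH):
    print('%.2f ms' % (time_read(STAT_PATH) * 1e3))
```

Run it once after the reproduction script and once after dropping
caches to compare the two readings.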

I've also added some code to memcg_stat_show to report the number of
cgroups in the hierarchy (iterations in for_each_mem_cgroup_tree).
Running the script increases it from ~700 to ~41000. The script
iterates 250,000 times, so only some fraction of the cgroups become
zombies.
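
As a rough back-of-the-envelope check on "some fraction", the figures
above can be plugged in directly. This assumes the whole increase in
the count is due to zombies left behind by the script, which I have
not verified, and the inputs are themselves approximate.

```python
baseline = 700        # approximate cgroup count before running the script
after = 41000         # approximate cgroup count after running the script
iterations = 250000   # N in the reproduction script

# Assumption: every extra cgroup is a zombie created by the script.
leaked = after - baseline
fraction = leaked / iterations

print('about %d zombies, %.0f%% of iterations' % (leaked, 100 * fraction))
# prints: about 40300 zombies, 16% of iterations
```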

I also tried the suggestion of force_empty: it makes the problem go
away, but is also very, very slow (about 0.5s per iteration), and
given the sensitivity of the test to small changes I don't know how
meaningful that is.
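
For reference, a force_empty step could be sketched as below. The
helper name is hypothetical and this is only an illustration: it
assumes a cgroup v1 memory controller, and that the call is made after
the process has left the cgroup and before the rmdir.

```python
import os


def force_empty(cgroup_dir):
    """Ask the kernel to reclaim as many of the cgroup's pages as it can.

    Hypothetical helper. On cgroup v1, writing to memory.force_empty
    triggers the reclaim; '0' is the value conventionally written.
    """
    with open(os.path.join(cgroup_dir, 'memory.force_empty'), 'w') as f:
        f.write('0\n')
```

In the reproduction script this would be called as force_empty(name)
between move_to('user.slice') and os.rmdir(name).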

Reproduction code (if you have tqdm installed you get a nice progress
bar, but not required). Hopefully Gmail doesn't do any format
mangling:


#!/usr/bin/env python3
import os

try:
    # Optional: tqdm's trange shows a progress bar
    from tqdm import trange as range
except ImportError:
    pass


def clean():
    # Remove a cgroup left over from an earlier run, if there is one
    try:
        os.rmdir(name)
    except FileNotFoundError:
        pass


def move_to(cgroup):
    # Move this process into the given cgroup
    with open(cgroup + '/tasks', 'w') as f:
        print(pid, file=f)


pid = os.getpid()
os.chdir('/sys/fs/cgroup/memory')
name = 'dummy'
N = 250000
clean()
try:
    for i in range(N):
        os.mkdir(name)
        move_to(name)
        for filename in ['memory.stat', 'memory.swappiness']:
            os.stat(os.path.join(name, filename))
        # Leave the cgroup before removing it
        move_to('user.slice')
        os.rmdir(name)
finally:
    move_to('user.slice')
    clean()


Regards
Bruce
--
Bruce Merry
Senior Science Processing Developer
SKA South Africa