[PATCH] mm/vmscan: fix infinite loop in drop_slab_node

From: zangchunxin
Date: Tue Sep 08 2020 - 10:32:46 EST


From: Chunxin Zang <zangchunxin@xxxxxxxxxxxxx>

On our server, there are about 10k memcgs on one machine, and they use
memory very frequently. When I trigger drop caches, the process loops
forever in drop_slab_node.

There are two reasons:
1. We have too many memcgs. Even if only one object is freed in each
   memcg, the total number of freed objects is greater than 10.
2. A single traversal of all memcgs takes a long time, so by the next
   traversal the memcgs visited first have many objects to free again,
   and the freed count is greater than 10 again.
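
The first reason can be illustrated with a small user-space toy model of
the old loop. This is only a sketch under stated assumptions: toy_shrink()
and old_loop_passes() are hypothetical names, not kernel functions, every
memcg is assumed to keep yielding the same number of objects on each pass
(as on a busy machine), and a max_passes cap is added so that the model
terminates, which the real loop does not have.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for shrink_slab(): assume each memcg always
 * gives up exactly `per_memcg` objects per call.
 */
static unsigned long toy_shrink(unsigned long per_memcg)
{
	return per_memcg;
}

/*
 * Toy model of the old drop_slab_node() loop. Returns the number of
 * full traversals performed before `freed <= 10`, or `max_passes` if
 * the threshold was never reached (the cap is an addition of this
 * sketch, the kernel loop has none).
 */
static unsigned long old_loop_passes(size_t nr_memcgs,
				     unsigned long per_memcg,
				     unsigned long max_passes)
{
	unsigned long passes = 0;
	unsigned long freed;

	do {
		freed = 0;
		for (size_t i = 0; i < nr_memcgs; i++)
			freed += toy_shrink(per_memcg);
		passes++;
	} while (freed > 10 && passes < max_passes);

	return passes;
}
```

With 10 memcgs freeing one object each, old_loop_passes(10, 1, 5) stops
after one pass. With 10000 memcgs, old_loop_passes(10000, 1, 5) only stops
because of the artificial cap.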

We can get the following info through 'ps':

root:~# ps -aux | grep drop
root 357956 ... R Aug25 21119854:55 echo 3 > /proc/sys/vm/drop_caches
root 1771385 ... R Aug16 21146421:17 echo 3 > /proc/sys/vm/drop_caches
root 1986319 ... R 18:56 117:27 echo 3 > /proc/sys/vm/drop_caches
root 2002148 ... R Aug24 5720:39 echo 3 > /proc/sys/vm/drop_caches
root 2564666 ... R 18:59 113:58 echo 3 > /proc/sys/vm/drop_caches
root 2639347 ... R Sep03 2383:39 echo 3 > /proc/sys/vm/drop_caches
root 3904747 ... R 03:35 993:31 echo 3 > /proc/sys/vm/drop_caches
root 4016780 ... R Aug21 7882:18 echo 3 > /proc/sys/vm/drop_caches

Using bpftrace to trace the 'freed' value in drop_slab_node:

root:~# bpftrace -e 'kprobe:drop_slab_node+70 {@ret=hist(reg("bp")); }'
Attaching 1 probe...
^B^C

@ret:
[64, 128)              1 |                                                    |
[128, 256)            28 |                                                    |
[256, 512)           107 |@                                                   |
[512, 1K)            298 |@@@                                                 |
[1K, 2K)             613 |@@@@@@@                                             |
[2K, 4K)            4435 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4K, 8K)             442 |@@@@@                                               |
[8K, 16K)            299 |@@@                                                 |
[16K, 32K)           100 |@                                                   |
[32K, 64K)           139 |@                                                   |
[64K, 128K)           56 |                                                    |
[128K, 256K)          26 |                                                    |
[256K, 512K)           2 |                                                    |

It is probably better to traverse the memcgs only once per drop caches
action. If users need more memory, they can drop caches again.
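
Dropping caches again from user space can be sketched as below. This is
an illustration only: write_drop_caches() and drop_caches_rounds() are
hypothetical helper names, the path is passed in as a parameter (on a real
system it would be /proc/sys/vm/drop_caches and needs root), and the
number of rounds is left to the caller.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write `mode` (for example "3\n", the bytes `echo 3` produces) to the
 * file at `path`. Returns 0 on success, -1 on failure.
 */
static int write_drop_caches(const char *path, const char *mode)
{
	size_t len = strlen(mode);
	int fd = open(path, O_WRONLY);
	ssize_t written;

	if (fd < 0)
		return -1;
	written = write(fd, mode, len);
	close(fd);
	return (written == (ssize_t)len) ? 0 : -1;
}

/* Trigger `rounds` drop caches actions, stopping at the first failure. */
static int drop_caches_rounds(const char *path, int rounds)
{
	for (int i = 0; i < rounds; i++)
		if (write_drop_caches(path, "3\n") != 0)
			return -1;
	return 0;
}
```

For example, drop_caches_rounds("/proc/sys/vm/drop_caches", 2) would
request two single-pass drops in a row.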

Signed-off-by: Chunxin Zang <zangchunxin@xxxxxxxxxxxxx>
Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
---
mm/vmscan.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b6d84326bdf2..9d8ee2ae5824 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -699,17 +699,12 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,

 void drop_slab_node(int nid)
 {
-	unsigned long freed;
+	struct mem_cgroup *memcg = NULL;

+	memcg = mem_cgroup_iter(NULL, NULL, NULL);
 	do {
-		struct mem_cgroup *memcg = NULL;
-
-		freed = 0;
-		memcg = mem_cgroup_iter(NULL, NULL, NULL);
-		do {
-			freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
-		} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
-	} while (freed > 10);
+		shrink_slab(GFP_KERNEL, nid, memcg, 0);
+	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
 }

void drop_slab(void)
--
2.11.0