Re: [PATCH 3/8] mm/vmscan: Throttle reclaim when no progress is being made

From: Mike Galbraith
Date: Wed Nov 24 2021 - 12:25:01 EST


On Wed, 2021-11-24 at 10:32 +0000, Mel Gorman wrote:
> On Tue, Nov 23, 2021 at 05:19:12PM -0800, Darrick J. Wong wrote:
>
> > AFAICT the system is mostly idle, but it's difficult to tell because ps
> > and top also get stuck waiting for this cgroup for whatever reason.
>
> But this is surprising because I expect that ps and top are not running
> within the cgroup. Was /proc/PID/stack readable?

Probably this: ps is blocked trying to take the test program's mmap_lock for read, while a memcg_test_1 thread holds it for write and is parked in reclaim_throttle.

crash> ps | grep UN
4418 4417 4 ffff8881cae66e40 UN 0.0 7620 980 memcg_test_1 <== the bad guy
4419 4417 6 ffff8881cae62f40 UN 0.0 7620 980 memcg_test_1
4420 4417 5 ffff8881cae65e80 UN 0.0 7620 980 memcg_test_1
4421 4417 7 ffff8881cae63f00 UN 0.0 7620 980 memcg_test_1
4422 4417 4 ffff8881cae60000 UN 0.0 7620 980 memcg_test_1
4423 4417 3 ffff888128985e80 UN 0.0 7620 980 memcg_test_1
4424 4417 7 ffff888117f79f80 UN 0.0 7620 980 memcg_test_1
4425 4417 2 ffff888117f7af40 UN 0.0 7620 980 memcg_test_1
4428 2791 6 ffff8881a8253f00 UN 0.0 38868 3568 ps
4429 2808 4 ffff888100c90000 UN 0.0 38868 3600 ps
crash> bt -sx 4429
PID: 4429 TASK: ffff888100c90000 CPU: 4 COMMAND: "ps"
#0 [ffff8881af1c3ce0] __schedule+0x285 at ffffffff817ae6c5
#1 [ffff8881af1c3d68] schedule+0x3a at ffffffff817aed4a
#2 [ffff8881af1c3d78] rwsem_down_read_slowpath+0x197 at ffffffff817b11a7
#3 [ffff8881af1c3e08] down_read_killable+0x5c at ffffffff817b142c
#4 [ffff8881af1c3e18] down_read_killable+0x5c at ffffffff817b142c
#5 [ffff8881af1c3e28] __access_remote_vm+0x3f at ffffffff8120131f
#6 [ffff8881af1c3e90] proc_pid_cmdline_read+0x148 at ffffffff812fc9a8
#7 [ffff8881af1c3ee8] vfs_read+0x92 at ffffffff8126a302
#8 [ffff8881af1c3f00] ksys_read+0x7d at ffffffff8126a72d
#9 [ffff8881af1c3f38] do_syscall_64+0x37 at ffffffff817a3f57
#10 [ffff8881af1c3f50] entry_SYSCALL_64_after_hwframe+0x44 at ffffffff8180007c
RIP: 00007f4b50fe8b5e RSP: 00007ffdd7f6fe38 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00007f4b5186a010 RCX: 00007f4b50fe8b5e
RDX: 0000000000020000 RSI: 00007f4b5186a010 RDI: 0000000000000006
RBP: 0000000000020000 R8: 0000000000000007 R9: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f4b5186a010
R13: 0000000000000000 R14: 0000000000000006 R15: 0000000000000000
ORIG_RAX: 0000000000000000 CS: 0033 SS: 002b
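
For context, reading /proc/PID/cmdline has to copy argv out of the
target process's address space, so __access_remote_vm() takes the
*target* mm's mmap_lock for read first, which is the
down_read_killable() in the trace above. A simplified sketch of that
shape (not the exact mm/memory.c code):

#include <linux/mm.h>	/* mmap_read_lock_killable() */

/* Sketch only: the /proc reader must take the target mm's mmap_lock
 * for read, so it blocks behind any long-lived write holder. */
static int access_remote_vm_sketch(struct mm_struct *mm,
				   unsigned long addr, void *buf, int len)
{
	if (mmap_read_lock_killable(mm))	/* down_read_killable() */
		return 0;			/* killed while waiting */

	/* ... get_user_pages_remote() + copy out, elided ... */

	mmap_read_unlock(mm);
	return len;
}

So the question is who is write-holding that mmap_lock: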
crash> mm_struct -x ffff8881021b4800
struct mm_struct {
  {
    mmap = 0xffff8881ccfe6a80,
    mm_rb = {
      rb_node = 0xffff8881ccfe61a0
    },
    ...
    mmap_lock = {
      count = {
        counter = 0x3
      },
      owner = {
        counter = 0xffff8881cae66e40
  ...
crash> bt 0xffff8881cae66e40
PID: 4418 TASK: ffff8881cae66e40 CPU: 4 COMMAND: "memcg_test_1"
#0 [ffff888154097a88] __schedule at ffffffff817ae6c5
#1 [ffff888154097b10] schedule at ffffffff817aed4a
#2 [ffff888154097b20] schedule_timeout at ffffffff817b311f
#3 [ffff888154097b90] reclaim_throttle at ffffffff811d802b
#4 [ffff888154097bf0] do_try_to_free_pages at ffffffff811da206
#5 [ffff888154097c40] try_to_free_mem_cgroup_pages at ffffffff811db522
#6 [ffff888154097cd0] try_charge_memcg at ffffffff81256440
#7 [ffff888154097d60] obj_cgroup_charge_pages at ffffffff81256c97
#8 [ffff888154097d88] obj_cgroup_charge at ffffffff8125898c
#9 [ffff888154097da8] kmem_cache_alloc at ffffffff81242099
#10 [ffff888154097de0] vm_area_alloc at ffffffff8106c87a
#11 [ffff888154097df0] mmap_region at ffffffff812082b2
#12 [ffff888154097e58] do_mmap at ffffffff81208922
#13 [ffff888154097eb0] vm_mmap_pgoff at ffffffff811e259f
#14 [ffff888154097f38] do_syscall_64 at ffffffff817a3f57
#15 [ffff888154097f50] entry_SYSCALL_64_after_hwframe at ffffffff8180007c
RIP: 00007f211c36b743 RSP: 00007ffeaac1bd58 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f211c36b743
RDX: 0000000000000003 RSI: 0000000000001000 RDI: 0000000000000000
RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000002022 R11: 0000000000000246 R12: 0000000000000003
R13: 0000000000001000 R14: 0000000000002022 R15: 0000000000000000
ORIG_RAX: 0000000000000009 CS: 0033 SS: 002b
crash>
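
So nothing is actually deadlocked: the writer is parked in
reclaim_throttle() with mmap_lock still write-held, and every reader
(ps, top, anything poking /proc/PID for that process) queues up behind
it until the throttle times out. The shape of the throttle is roughly
this (a sketch only; the real timeout and wakeup policy live in
mm/vmscan.c, and HZ/10 here is illustrative, not the exact value):

#include <linux/mmzone.h>	/* pg_data_t, VMSCAN_THROTTLE_* */
#include <linux/wait.h>		/* DEFINE_WAIT(), prepare_to_wait() */
#include <linux/sched.h>	/* schedule_timeout() */

/* Sketch only: the task sleeps TASK_UNINTERRUPTIBLE, which is why the
 * throttled memcg_test_1 threads all show up as UN above. */
static void reclaim_throttle_sketch(pg_data_t *pgdat,
				    enum vmscan_throttle_state reason)
{
	wait_queue_head_t *wqh = &pgdat->reclaim_wait[reason];
	DEFINE_WAIT(wait);

	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
	schedule_timeout(HZ / 10);	/* illustrative timeout */
	finish_wait(wqh, &wait);
}

Throttling while mmap_lock is write-held is what turns a private
reclaim stall into a box that looks hung from the outside.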