Re: [regression -next0117] What is kcompactd and why is he eating 100% of my cpu?

From: Mel Gorman
Date: Sun Jan 27 2019 - 09:16:14 EST


On Sat, Jan 26, 2019 at 09:56:53PM -0500, valdis.kletnieks@xxxxxx wrote:
> On Sat, 26 Jan 2019 21:00:05 +0100, Pavel Machek said:
>
> > top - 13:38:51 up 1:42, 16 users, load average: 1.41, 1.93, 1.62
> > Tasks: 182 total, 3 running, 138 sleeping, 0 stopped, 0 zombie
> > %Cpu(s): 2.3 us, 57.8 sy, 0.0 ni, 39.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > KiB Mem: 3020044 total, 2429420 used, 590624 free, 27468 buffers
> > KiB Swap: 2097148 total, 0 used, 2097148 free. 1924268 cached Mem
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 608 root 20 0 0 0 0 R 99.6 0.0 11:34.38 kcompactd0
> > 9782 root 20 0 0 0 0 I 7.9 0.0 0:59.02 kworker/0:
> > 2971 root 20 0 46624 23076 13576 S 4.3 0.8 2:50.22 Xorg
>
> I've noticed this as well on earlier kernels (next-20181224 to 20190115)
>
> Some more info:
>
> 1) echo 3 > /proc/sys/vm/drop_caches unwedges kcompactd in 1-3 seconds.
>

This aspect is curious as it indicates that kcompactd could potentially
be looping indefinitely, but it's not something I've experienced myself.
By any chance, is there a predictable reproduction case for this?

> I've also seen khugepaged hung up:
>
> cat /proc/29/stack
> [<0>] ___preempt_schedule+0x16/0x18
> [<0>] page_vma_mapped_walk+0x60/0x840
> [<0>] remove_migration_pte+0x67/0x390
> [<0>] rmap_walk_file+0x186/0x380
> [<0>] rmap_walk+0xa3/0xd0
> [<0>] remove_migration_ptes+0x69/0x70
> [<0>] migrate_pages+0xb6d/0xfd8
> [<0>] compact_zone+0xb70/0x1370
> [<0>] compact_zone_order+0xd8/0x120
> [<0>] try_to_compact_pages+0xe5/0x550
> [<0>] __alloc_pages_direct_compact+0x6d/0x1a0
> [<0>] __alloc_pages_slowpath+0x6c9/0x1640
> [<0>] __alloc_pages_nodemask+0x558/0x5b0
> [<0>] khugepaged+0x499/0x810
> [<0>] kthread+0x158/0x170
> [<0>] ret_from_fork+0x3a/0x50
> [<0>] 0xffffffffffffffff
>
> Looks like something has gone astray with compact_zone.
>

It's possible that the buffer aspect of the trace is a red herring and
that some corner case prevents the migration scanner and the free
scanner from meeting, so compaction never exits. Again, a reproduction
case of some sort would be nice, or an indication of how long it takes
to trigger. An update of the series is due which may or may not fix
this, but if it doesn't, we'll need to start tracing this to see what's
going on at the point of failure.
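
For anyone trying to pin down how long it takes to trigger, a rough
sketch along these lines could be left running until kcompactd starts
spinning. It is only an illustration and not part of any series: the
function names are made up here, and it assumes nothing more than
/proc/vmstat exposing its compaction counters as "compact_*" lines in
"name value" form. Scan counters that keep climbing while the success
counters stay flat would be consistent with the scanners never meeting.

```python
#!/usr/bin/env python3
"""Print which compact_* counters in /proc/vmstat moved between samples."""
import time


def parse_compact_counters(vmstat_text):
    """Return {name: value} for every 'compact_*' line in vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("compact_"):
            counters[fields[0]] = int(fields[1])
    return counters


def counter_deltas(before, after):
    """Return {name: after - before} for the counters that changed."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }


def watch(samples=60, interval=1.0, path="/proc/vmstat"):
    """Hypothetical helper: print timestamped deltas for `samples` intervals."""
    with open(path) as f:
        previous = parse_compact_counters(f.read())
    for _ in range(samples):
        time.sleep(interval)
        with open(path) as f:
            current = parse_compact_counters(f.read())
        print(time.strftime("%H:%M:%S"), counter_deltas(previous, current) or "no change")
        previous = current
```

Calling watch() from an interactive session with a suitably large
sample count gives a timestamped log that can be lined up against the
moment kcompactd0 hits 100% in top.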

--
Mel Gorman
SUSE Labs