Re: 5.1 kernel: khugepaged stuck at 100%

From: Justin Piszcz
Date: Thu May 16 2019 - 10:20:03 EST


On Thu, May 16, 2019 at 10:14 AM Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx> wrote:
>
> On Tue, May 14, 2019 at 9:16 AM Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
>
> > Could you check what khugepaged doing?
> >
> > cat /proc/$(pidof khugepaged)/stack
>
> It is doing it again, 10:12am - 2019-05-16
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 77 root 39 19 0 0 0 R 100.0 0.0 92:06.94 khugepaged
>
> Kernel: 5.1.2
>
> $ sudo cat /proc/$(pidof khugepaged)/stack
> [<0>] 0xffffffffffffffff
>
> $ perf top
>
> PerfTop: 3716 irqs/sec kernel:92.9% exact: 99.1% lost: 68/68
> drop: 0/0 [4000Hz cycles], (all, 12 CPUs)
> -------------------------------------------------------------------------------
>
> 47.53% [kernel] [k] compaction_alloc
> 38.88% [kernel] [k] __pageblock_pfn_to_page
> 6.68% [kernel] [k] nmi
> 0.58% [kernel] [k] __list_del_entry_valid
> 0.48% [kernel] [k] format_decode
> 0.39% [kernel] [k] __rb_insert_augmented
> 0.25% libdbus-1.so.3.19.9 [.] _dbus_string_hex_decode
> 0.24% [kernel] [k] entry_SYSCALL_64_after_hwframe
> 0.20% perf [.] rb_next
> 0.19% perf [.] __symbols__insert

Incase this is asked for later:

# find . -maxdepth 1 -type f | while read f; do echo "$f
---------------------"; cat $f; echo " "; done
./environ ---------------------

./auxv ---------------------

./status ---------------------
Name: khugepaged
Umask: 0000
State: R (running)
Tgid: 77
Ngid: 0
Pid: 77
PPid: 2
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 64
Groups:
NStgid: 77
NSpid: 77
NSpgid: 0
NSsid: 0
Threads: 1
SigQ: 1/257125
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: ffffffffffffffff
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: fff
Cpus_allowed_list: 0-11
Mems_allowed: 1
Mems_allowed_list: 0
voluntary_ctxt_switches: 7728
nonvoluntary_ctxt_switches: 94662

./personality ---------------------
00000000

./limits ---------------------
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes 257125 257125 processes
Max open files 1024 4096 files
Max locked memory 65536 65536 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 257125 257125 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us

./sched ---------------------
khugepaged (77, #threads: 1)
-------------------------------------------------------------------
se.exec_start : 84565632.684623
se.vruntime : 30287808.462012
se.sum_exec_runtime : 5808238.978150
se.nr_migrations : 946
nr_switches : 102390
nr_voluntary_switches : 7728
nr_involuntary_switches : 94662
se.load.weight : 15360
se.runnable_weight : 15360
se.avg.load_sum : 47454
se.avg.runnable_load_sum : 47454
se.avg.util_sum : 48835831
se.avg.load_avg : 14
se.avg.runnable_load_avg : 14
se.avg.util_avg : 1024
se.avg.last_update_time : 84565632684032
se.avg.util_est.ewma : 25
se.avg.util_est.enqueued : 16
policy : 0
prio : 139
clock-delta : 11

./comm ---------------------
khugepaged

./syscall ---------------------
running

./cmdline ---------------------

./stat ---------------------
77 (khugepaged) R 2 0 0 0 -1 27265088 0 0 0 0 0 580824 0 0 39 19 1 0
725 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 0 0 0 17 4 0 0
0 0 0 0 0 0 0 0 0 0 0

./statm ---------------------
0 0 0 0 0 0 0

./maps ---------------------

./mem ---------------------

./clear_refs ---------------------
cat: ./clear_refs: Invalid argument

./smaps ---------------------

./smaps_rollup ---------------------
cat: ./smaps_rollup: No such process

./pagemap ---------------------

./wchan ---------------------
0
./stack ---------------------
[<0>] 0xffffffffffffffff

./schedstat ---------------------
5808249976150 707215957352 102392

./latency ---------------------
Latency Top version : v0.1

./cpuset ---------------------
/

./cgroup ---------------------
11:perf_event:/
10:cpuset:/
9:devices:/
8:rdma:/
7:cpu,cpuacct:/
6:blkio:/
5:freezer:/
4:net_cls:/
3:memory:/
2:pids:/
1:name=systemd:/
0::/


./oom_score ---------------------
0

./oom_adj ---------------------
0

./oom_score_adj ---------------------
0

./coredump_filter ---------------------

./io ---------------------
rchar: 0
wchar: 0
syscr: 0
syscw: 0
read_bytes: 0
write_bytes: 0
cancelled_write_bytes: 0

./uid_map ---------------------
0 0 4294967295

./gid_map ---------------------
0 0 4294967295

./projid_map ---------------------
0 0 4294967295

./setgroups ---------------------
allow

./timerslack_ns ---------------------
50000

./stack_depth ---------------------
previous stack depth: 16376
stack depth: 16376