LTP hugemmap05 test case failure on arm64 with linux-next (next-20190613)

From: Qian Cai
Date: Thu Jun 13 2019 - 17:38:48 EST


The LTP hugemmap05 test case [1] fails to exit properly and then degrades
system performance on arm64 with linux-next (next-20190613). The bisection so
far indicates:

BAD: 30bafbc357f1 Merge remote-tracking branch 'arm64/for-next/core'
GOOD: 0c3d124a3043 Merge remote-tracking branch 'arm64-fixes/for-next/fixes'

I don't see anything obvious between those two pull requests, so I guess
something in 'arm64/for-next/core' is wrong.

$ git log --oneline 361413ee1992..9b6047220590
9b6047220590 arm64: mm: avoid redundant READ_ONCE(*ptep)
4745224b4509 arm64/mm: Refactor __do_page_fault()
c49bd02f4c74 arm64/mm: Document write abort detection from ESR
8e01076afd97 arm64: Fix comment after #endif
f086f67485c5 arm64: ptrace: add support for syscall emulation
fd3866381be2 arm64: add PTRACE_SYSEMU{,SINGLESTEP} definations to uapi headers
15532fd6f57c ptrace: move clearing of TIF_SYSCALL_EMU flag to core
616810360043 arm64/mm: Drop task_struct argument from __do_page_fault()
a0509313d5de arm64/mm: Drop mmap_sem before calling __do_kernel_fault()
01de1776f62e arm64/mm: Identify user instruction aborts
87dedf7c61ab arm64/mm: Change BUG_ON() to VM_BUG_ON() in [pmd|pud]_set_huge()
2e6aee5af330 arm64: kernel: use aff3 instead of aff2 in comment
27e6e7d63fc2 arm64/cpufeature: Convert hook_lock to raw_spin_lock_t in cpu_enable_ssbs()
0c1f14ed1226 arm64: mm: make CONFIG_ZONE_DMA32 configurable
f7f0097af67c arm64/mm: Simplify protection flag creation for kernel huge mappings
7b8c87b297a7 arm64: cacheinfo: Update cache_line_size detected from DT or PPTT
9a83c84c3a49 drivers: base: cacheinfo: Add variable to record max cache line size
6dcdefcde413 arm64/fpsimd: Don't disable softirq when touching FPSIMD/SVE state
54b8c7cbc57c arm64/fpsimd: Introduce fpsimd_save_and_flush_cpu_state() and use it
6fa9b41f6f15 arm64/fpsimd: Remove the prototype for sve_flush_cpu_state()
201d355c15c1 arm64/mm: Move PTE_VALID from SW defined to HW page table entry definitions
441a62780687 arm64/hugetlb: Use macros for contiguous huge page sizes

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/hugetlb/hugemmap/hugemmap05.c

# /opt/ltp/testcases/bin/hugemmap05 -s -m
tst_test.c:1111: INFO: Timeout per run is 0h 05m 00s
hugemmap05.c:235: INFO: original nr_hugepages is 0
hugemmap05.c:248: INFO: original nr_overcommit_hugepages is 0
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Test timeouted, sending SIGKILL!
Cannot kill test processes!
Congratulation, likely test hit a kernel bug.
Exitting uncleanly...

[ 7792.681691][ T5025] LTP: starting hugemmap05_3 (hugemmap05 -s -m)
[ 7911.149058][ T1309] INFO: task hugemmap05:51035 can't die for more than 122 seconds.
[ 7911.156833][ T1309] hugemmap05      R  running task    27648 51035      1 0x0000000d
[ 7911.164654][ T1309] Call trace:
[ 7911.167823][ T1309]  __switch_to+0x2e0/0x37c
[ 7911.172128][ T1309]  0x3e4ca
[ 7911.175033][ T1309]
[ 7911.175033][ T1309] Showing all locks held in the system:
[ 7911.182888][ T1309] 1 lock held by khungtaskd/1309:
[ 7911.187778][ T1309]  #0: 0000000037a3e572 (rcu_read_lock){....}, at: rcu_lock_acquire+0x8/0x38
[ 7911.196655][ T1309] 4 locks held by hugemmap05/51035:
[ 7911.201731][ T1309] 4 locks held by hugemmap05/51038:
[ 7911.206814][ T1309]
[ 7911.209025][ T1309] =============================================
[ 7911.209025][ T1309]