Re: [pgtable_trans_huge_withdraw] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020

From: Fengguang Wu
Date: Mon Oct 30 2017 - 05:28:55 EST


Hi Kirill,

On Mon, Oct 30, 2017 at 12:19:40PM +0300, Kirill A. Shutemov wrote:
> On Mon, Oct 30, 2017 at 12:37:01AM +0100, Fengguang Wu wrote:
> > CC MM people.
> >
> > On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
> > > Hi Linus,
> > >
> > > Up to now we see the below boot error/warnings when testing v4.14-rc6.
> > >
> > > They hit the RC release mainly due to various imperfections in 0day's
> > > auto bisection. So I manually list them here and CC the likely easy to
> > > debug ones to the corresponding maintainers in the followup emails.
> > >
> > > boot_successes: 4700
> > > boot_failures: 247
> > >
> > > BUG:kernel_hang_in_test_stage: 152
> > > BUG:kernel_reboot-without-warning_in_test_stage: 10
> > > BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c: 1
> > > BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c: 3
> > > BUG:sleeping_function_called_from_invalid_context_at_mm/page_alloc.c: 21
> > > BUG:soft_lockup-CPU##stuck_for#s: 1
> > > BUG:unable_to_handle_kernel: 13
> >
> > Here is the call trace:
> >
> > [ 956.669197] [ 956.670421] stress-ng: fail: [27945] stress-ng-numa: get_mempolicy: errno=22 (Invalid argument)
>
> Can you also share how you run stress-ng? Is it reproducible?

The command line is

stress-ng --class cpu --sequential $(nproc) --timeout 1 --times --verify --metrics-brief

The test box is

model: Broadwell-EP
nr_cpu: 88
memory: 128G

It shows up 4 times in 6 test runs:

/result/stress-ng/60s-cpu-performance/lkp-bdw-ep6/debian-x86_64-2016-08-31.cgz/x86_64-rhel-7.2/gcc-6/bb176f67090ca54869fc1262c913aa69d2ede070/matrix.json

"dmesg.BUG:unable_to_handle_kernel": [
0,
1,
1,
1,
0,
1
],
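
For what it's worth, the 4-in-6 figure can be read straight off that array. A minimal Python sketch, assuming each entry is the per-run count for that dmesg key and that any nonzero entry means the BUG showed up in that run (hit_rate is a hypothetical helper for illustration, not part of the LKP tooling):

```python
import json

# Fragment copied from the matrix.json excerpt above; the real file
# holds many more keys than this one.
matrix = json.loads(
    '{"dmesg.BUG:unable_to_handle_kernel": [0, 1, 1, 1, 0, 1]}'
)


def hit_rate(samples):
    """Return (hits, runs), where a hit is any nonzero per-run entry."""
    hits = sum(1 for s in samples if s)
    return hits, len(samples)


hits, runs = hit_rate(matrix["dmesg.BUG:unable_to_handle_kernel"])
print(f"{hits} of {runs} runs hit the BUG")
```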

Thanks,
Fengguang