Re: [pgtable_trans_huge_withdraw] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020

From: Kirill A. Shutemov
Date: Mon Oct 30 2017 - 07:28:03 EST


On Mon, Oct 30, 2017 at 10:28:42AM +0100, Fengguang Wu wrote:
> Hi Kirill,
>
> On Mon, Oct 30, 2017 at 12:19:40PM +0300, Kirill A. Shutemov wrote:
> > On Mon, Oct 30, 2017 at 12:37:01AM +0100, Fengguang Wu wrote:
> > > CC MM people.
> > >
> > > On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
> > > > Hi Linus,
> > > >
> > > > Up to now we see the below boot error/warnings when testing v4.14-rc6.
> > > >
> > > > They hit the RC release mainly due to various imperfections in 0day's
> > > > auto bisection. So I manually list them here and CC the likely easy to
> > > > debug ones to the corresponding maintainers in the followup emails.
> > > >
> > > > boot_successes: 4700
> > > > boot_failures: 247
> > > >
> > > > BUG:kernel_hang_in_test_stage: 152
> > > > BUG:kernel_reboot-without-warning_in_test_stage: 10
> > > > BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c: 1
> > > > BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/rwsem.c: 3
> > > > BUG:sleeping_function_called_from_invalid_context_at_mm/page_alloc.c: 21
> > > > BUG:soft_lockup-CPU##stuck_for#s: 1
> > > > BUG:unable_to_handle_kernel: 13
> > >
> > > Here is the call trace:
> > >
> > > [ 956.669197] [ 956.670421] stress-ng: fail: [27945] stress-ng-numa:
> > > get_mempolicy: errno=22 (Invalid argument)
> >
> > Can you also share how you run stress-ng? Is it reproducible?
>
> The command line is
>
> stress-ng --class cpu --sequential $(nproc) --timeout 1 --times --verify --metrics-brief
>
> The test box is
>
> model: Broadwell-EP
> nr_cpu: 88
> memory: 128G

By chance, do you have an emulated nvdimm there? I suspect DAX stuff.
Do you have the full dmesg around?

--
Kirill A. Shutemov