Re: 5.0-rc kernel hangs on early boot
From: Will Deacon
Date: Wed Feb 13 2019 - 06:25:29 EST

On Wed, Feb 13, 2019 at 11:21:41AM +0000, Mel Gorman wrote:
> On Wed, Feb 13, 2019 at 11:18:44AM +0000, Will Deacon wrote:
> > On Wed, Feb 13, 2019 at 11:25:40AM +0300, Yury Norov wrote:
> > > My kernel on qemu/arm64 setup hangs at early boot since v5.0-rc1.
> > > Backtrace is not too verbose:
> > > (gdb) i threads
> > > Id Target Id Frame
> > > * 1 Thread 1 (CPU#0 [running]) 0xffff000010a49b74 in __delay (cycles=4096)
> > > at arch/arm64/lib/delay.c:49
> > > 2 Thread 2 (CPU#1 [halted ]) 0x0000000000000000 in ?? ()
> > > 3 Thread 3 (CPU#2 [halted ]) 0x0000000000000000 in ?? ()
> > > 4 Thread 4 (CPU#3 [halted ]) 0x0000000000000000 in ?? ()
> > > (gdb) bt
> > > #0 0xffff000010a49b74 in __delay (cycles=4096) at arch/arm64/lib/delay.c:49
> > > Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> > >
> > > Reverting the patch
> > > 1c30844d2dfe272d58c ("mm: reclaim small amounts of memory when an external
> > > fragmentation event occurs") together with following patch
> > > 73444bc4d8f92e46a20 ("mm, page_alloc: do not wake kswapd with zone lock held")
> > > helps me to boot normally.
> > >
> > > Some system information is below, and config is attached.
> >
> > FWIW, running with your command-line and .config under KVM with earlycon
> > leads to an early page allocation failure followed by a NULL dereference
> > during boot if only 1G is configured (log below). For the mm folks, it's
> > probably worth pointing out that you're using 64k pages.
>
> Thanks Will.
>
> While I agree that going OOM early is a problem and would explain why
> the boosting logic was hit at all, it's still the case that the boosting
> should not divide by zero. Even if the booting is broken due to a lack
> of memory, I'd still not prefer to crash due to 1c30844d2dfe272d58c.

Yup, sorry, our previous mails crossed paths. Your patch looks sensible in
its own right, I'm just left wondering why we're OOM so early during boot!
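
For illustration only, here is a minimal userspace sketch of the kind of
guard discussed above. It is not the patch under discussion and not kernel
code: struct zone_sketch, boost_percent() and the "factor in units of
1/10000" convention are all hypothetical names and assumptions chosen for
the example. The point it shows is that a value derived from a watermark
that may still be zero early in boot has to be checked before it is used
as a divisor.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a zone (assumption, not the kernel's
 * struct zone). high_wmark may still be 0 very early in boot.
 */
struct zone_sketch {
    uint64_t high_wmark;  /* pages; 0 until watermarks are set up */
    uint64_t boost;       /* pages currently added on top of it */
};

/*
 * Return how far the current boost is towards its cap, in percent,
 * clamped to 100.
 *
 * The cap is high_wmark scaled by boost_factor, where boost_factor
 * is assumed to be expressed in units of 1/10000.
 *
 * If the cap works out to 0 (watermark not initialised yet, or a
 * boost_factor of 0), there is nothing to boost against, so return 0
 * instead of dividing by zero.
 */
unsigned int boost_percent(const struct zone_sketch *z,
                           unsigned int boost_factor)
{
    uint64_t cap = z->high_wmark * boost_factor / 10000;
    uint64_t pct;

    if (cap == 0)
        return 0;  /* the guard: never use a zero cap as a divisor */

    pct = z->boost * 100 / cap;
    return pct > 100 ? 100 : (unsigned int)pct;
}
```

With a zone whose high_wmark is still 0, boost_percent() returns 0 rather
than faulting; once high_wmark is non-zero the ratio is computed as usual.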