Re: linux-next crash during very early boot
From: Joonsoo Kim
Date: Wed Apr 20 2016 - 04:10:44 EST
On Fri, Apr 15, 2016 at 10:10:33AM -0400, Valdis.Kletnieks@xxxxxx wrote:
> On Thu, 14 Apr 2016 10:35:47 +0900, Joonsoo Kim said:
> > On Wed, Apr 13, 2016 at 08:29:46PM -0400, Valdis Kletnieks wrote:
> > > I'm seeing my laptop crash/wedge up/something during very early
> > > boot - before it can write anything to the console. Nothing in pstore,
> > > need to hold down the power button for 6 seconds and reboot.
> > >
> > > git bisect points at:
> > >
> > > commit 7a6bacb133752beacb76775797fd550417e9d3a2
> > > Author: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> > > Date: Thu Apr 7 13:59:39 2016 +1000
> > >
> > > mm/slab: factor out kmem_cache_node initialization code
> > >
> > > It can be reused on other place, so factor out it. Following patch will
> > > use it.
> > >
> > >
> > > Not sure what the problem is - the logic *looks* ok at first read. The
> > > patch *does* remove a spin_lock_irq() - but I find it difficult to
> > > believe that with it gone, my laptop is able to hit the race condition
> > > the spinlock protects against *every single boot*.
> > >
> > > The only other thing I see is that n->free_limit used to be assigned
> > > every time, and now it's only assigned at initial creation.
> >
> > Hello,
> >
> > My fault. It should be assigned every time. Please test the patch below.
> > I will send it with a proper SOB after you confirm the problem disappears.
> > Thanks for the report and analysis!
>
> Following up - I verified that it was your patch series and not a bad bisect
> by starting with a clean next-20160413 and reverting that series - and the
> resulting kernel boots fine.
>
> Will take a closer look at your fix patch and figure out what's still changed
> afterwards - there's obviously some small semantic change that actually
> matters, but we're not spotting it yet...
Hello,
Did you try testing the patch at the following link on top of my fix for "mm/slab:
factor out kmem_cache_node initialization code"?
https://lkml.org/lkml/2016/4/10/703
I mentioned it in another thread, but you didn't reply to it, so I'm
curious.
Thanks.
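
For reference, the n->free_limit concern quoted above (assigned every time
before the refactoring, assigned only at initial creation after it) can be
illustrated with a small userspace sketch. This is not the kernel code:
struct node_sketch, compute_limit(), setup_always() and setup_once() are
hypothetical names, and the limit formula is only an assumption loosely
modeled on how a per-node free limit might depend on the CPU count, batch
count and objects per slab.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-node cache state (not struct kmem_cache_node). */
struct node_sketch {
	bool initialized;
	unsigned long free_limit;
};

/* Illustrative limit formula; an assumption made for this sketch only. */
static unsigned long compute_limit(unsigned long cpus,
				   unsigned long batchcount,
				   unsigned long num)
{
	return (1 + cpus) * batchcount + num;
}

/* Variant A: free_limit is refreshed on every call. */
static void setup_always(struct node_sketch *n, unsigned long cpus,
			 unsigned long batchcount, unsigned long num)
{
	n->initialized = true;
	n->free_limit = compute_limit(cpus, batchcount, num);
}

/* Variant B: free_limit is set only when the node is first created. */
static void setup_once(struct node_sketch *n, unsigned long cpus,
		       unsigned long batchcount, unsigned long num)
{
	if (n->initialized)
		return;	/* later tuning changes never reach free_limit */
	n->initialized = true;
	n->free_limit = compute_limit(cpus, batchcount, num);
}
```

If the setup routine is called a second time with a larger batch count,
variant A picks up the new limit while variant B keeps the stale value from
the first call. That is the kind of small semantic change the report
describes.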