RE: kernel panics with 4.14.X versions

From: Dexuan Cui
Date: Fri Apr 20 2018 - 13:43:33 EST


> From: Jan Kara <jack@xxxxxxx>
> Sent: Friday, April 20, 2018 03:22
> On Thu 19-04-18 21:37:25, Dexuan Cui wrote:
> > > From: Jan Kara
> > > Sent: Thursday, April 19, 2018 13:23
> > > Good news guys, Robert has just spotted a bug which looks like what I'd
> > > expect can cause your lockups / crashes. I've merged his patch to my tree
> > > and will push it to Linus for -rc3 so eventually it should land in
> > > appropriate stable trees as well. If you are too eager to test it out, it
> > > is attached for you to try.
> > >
> > > Jan Kara
> >
> > The patch's changelog says "... this behavior results in a kernel panic."
> > This sounds like a reference to corrupt memory causing a page fault or
> > general protection fault.
> >
> > But what I saw is only a lockup rather than a kernel panic:
> > watchdog: BUG: soft lockup - CPU#10 stuck for 22s! [java:87260]
> >
> > So I guess what I saw could be a different, unresolved issue?
>
> Actually I don't think so. The list iteration simply went through a stray
> pointer. That can crash but it can also end in an infinite loop, or it can
> just randomly corrupt memory. I've seen all these situations with similar
> problems. So the fix is definitely worth trying.
>
> Jan Kara

Thanks for the explanation! It sounds promising!

We haven't been able to reproduce the issue ourselves.
If our customer still has the setup that reproduces the issue, we'll try to
test the patch.

-- Dexuan