Re: kernel-4.9.270 crash
From: wim
Date: Wed Sep 08 2021 - 20:56:38 EST
On Wed, Sep 08, 2021 at 07:30:49AM +0200, Greg KH wrote:
> > > > ...
> > > > Aug 1 20:51:24 djo kernel: [<f8bc4ef7>] ? 0xf8bc4ef7
> > >
> > > <snip>
> > >
> > > These aren't going to help us much, can you turn on debugging symbols
> > > for these crashes for us to see the symbol names?
> >
> > ERROR: not enough memory to load nouveau.ko
>
> That's the only error? Maybe you don't have enough memory?
nouveau.ko with symbols is really huge. That machine has only 2GB of RAM,
so I'm not surprised.
> > i915.ko is smaller and my laptop is bigger. Identical crash, no symbols.
>
> Odd.
I've had that before, some years ago. The devs were very reluctant to start
investigating. After a while the bug just vanished. "Bugs come and go" was
their remark.
This time the bug does not vanish on its own.
> > > > > Can you use 'git bisect' to track down the offending commit?
> > and that brought me reasonably fast to this:
> >
> > 3bd3a8ca5a7b1530f463b6e1cc811c085e6ffa01 is the first bad commit
> > commit 3bd3a8ca5a7b1530f463b6e1cc811c085e6ffa01
> > Author: Maciej W. Rozycki <macro@xxxxxxxxxxx>
> > Date: Thu May 13 11:51:50 2021 +0200
> > ...
>
> That is a vt change that handles an issue with a console driver, so this
> feels like a false failure.
>
> If you revert this change on a newer kernel release, does it work?
It is not a false failure.
git checkout v4.9.282
git revert 3bd3a8ca5a7b1530f463b6e1cc811c085e6ffa01
Lo and behold, no crash on modprobe i915 !!!
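The two steps above can be wrapped in a small helper. This is only a minimal
sketch: the function name revert_on_tag and its three arguments are
hypothetical, not something from the kernel tree or from this thread. It
checks out a release tag, reverts one suspect commit on top of it, and backs
out again if the revert does not apply cleanly.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of any kernel tooling):
# check out a release tag and revert a single suspect commit on top of it,
# so the resulting tree can be rebuilt and tested.
revert_on_tag() {
    repo=$1      # path to the git work tree
    tag=$2       # release tag to test, e.g. v4.9.282
    commit=$3    # commit that bisect reported as the first bad one

    # Detached checkout of the release under test.
    git -C "$repo" checkout --quiet --detach "$tag" || return 1

    # --no-edit keeps the default "Revert ..." commit message.
    if ! git -C "$repo" revert --no-edit "$commit"; then
        # Leave the tree clean if the revert could not be applied.
        git -C "$repo" revert --abort 2>/dev/null
        return 1
    fi
}
```

In a kernel tree this would be called as
revert_on_tag . v4.9.282 3bd3a8ca5a7b1530f463b6e1cc811c085e6ffa01
followed by the usual rebuild, reboot and modprobe i915.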
> And what about showing us the symbols of that traceback?
Which symbols of which traceback? With the revert applied it does not crash!
And when it does crash (the previous case, without the revert), there are
still no symbols, even with debugging turned on. The log is just the same as
before. Apparently it ran invalid code.
What does 'Divide Error: 0000' mean? A divide-by-zero error?
Regards, Wim.