Re: system keeps freezing once every 24 hours / random apps crashing

From: Jesper Juhl
Date: Sat Dec 31 2005 - 06:28:15 EST


On 12/31/05, Mark v Wolher <trilight@xxxxxxxxx> wrote:
> Alistair John Strachan wrote:
> > On Saturday 31 December 2005 00:20, Mark v Wolher wrote:
> > [snip]
> >
> >>>This is good news -- you stand a better chance of achieving the stability
> >>>you require by eliminating variables. VMWare and NVIDIA are useful
> >>>softwares, and I would not deny that, but they are closed source and thus
> >>>any conflicts resulting from their use are not necessary LKML material
> >>>(however, if the interaction is generic and is as a result of a kernel
> >>>bug, then the maintainer would very much like to hear it).
> >>
> >>Okay, i have something interesting now, i only had the nvidia module
> >>loaded so my x-configuration starts up as usual. (not saying the nvidia
> >>module is flawless, i'm sure it still contains bugs)
> >>But here is the crash info, this time it was mozilla, i think this
> >>speaks more hehe :
> >>
The fact that it happens to be mozilla that crashes this time says
*nothing at all* as long as you have binary-only modules loaded. They
may have messed up the kernel, and we have no way to debug that.


> >>Dec 31 00:55:28 localhost kernel: mm/memory.c:106: bad pgd 061f0c08.
> >>Dec 31 00:55:28 localhost kernel: mm/memory.c:106: bad pgd 06b96000.
> >>Dec 31 00:55:28 localhost kernel: mm/memory.c:106: bad pgd 18000bf8.
> >>Dec 31 00:55:28 localhost kernel: ------------[ cut here ]------------
> >>Dec 31 00:55:28 localhost kernel: kernel BUG at mm/mmap.c:2214!
> >>Dec 31 00:55:28 localhost kernel: invalid operand: 0000 [#1]
> >>Dec 31 00:55:28 localhost kernel: SMP
> >>Dec 31 00:55:28 localhost kernel: Modules linked in: nvidia
> >
> >
> > Steady and sure progress. Now, the trace below doesn't explicitly mention any
> > nvidia symbols, but this line must disappear before anybody will bother to
> > read your report.
> >
> > Remove the module. This does not mean unload, this means "never load in the
> > first place". Then reproduce the problem. If you are successful, send a new

Agreed. Once the nvidia or vmware binary-only modules have been loaded
even once, the state of the kernel is a complete unknown.
To be useful, all testing you do *must* happen without either the nvidia
or the vmware module ever having been loaded. As soon as you load one of
them, even for a second, any further testing becomes irrelevant.
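
A minimal sketch of how one might scan a saved log for the "Modules
linked in:" line before sending a report. The watch list (nvidia, vmmon,
vmnet) and the function names are assumptions made for illustration, not
an official list or tool. Note that the line only shows modules present
at the moment of the oops, so a clean line does not prove that nothing
was loaded earlier.

```python
import re

# Hypothetical watch list: module names assumed to belong to binary-only drivers.
BINARY_ONLY = {"nvidia", "vmmon", "vmnet"}


def linked_modules(log_text):
    """Return module names found on every 'Modules linked in:' line."""
    modules = []
    for line in log_text.splitlines():
        match = re.search(r"Modules linked in:\s*(.*)$", line)
        if match:
            for name in match.group(1).split():
                # Some kernels append flags such as "(P)"; keep only the name.
                modules.append(name.split("(")[0])
    return modules


def binary_only_modules(log_text):
    """Return the sorted watch-list modules that appear in the log."""
    return sorted(set(linked_modules(log_text)) & BINARY_ONLY)


sample = "Dec 31 00:55:28 localhost kernel: Modules linked in: nvidia"
print(binary_only_modules(sample))
```

An empty result here is necessary but not sufficient: the module must
never have been loaded since boot.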


> > email (not pinned to this thread) with a subject a la "kernel BUG at
> > mm/mmap.c:2214". State that the kernel is not tainted.
> >
> > At this point all you can do is wait. Good luck!
> >
>
> Well, i guess i'll have to do that to be sure. But i must say that i did
> try the nv module and de-installed the nvidia binary module. It didn't
> matter, the system froze but didn't leave anything in the logs, this
> time it did. Doesn't that help at all ?
>
Not really. Anything it leaves in the logs may have been caused by the
binary-only module(s), and we have no way to find out. The info is next
to useless as long as binary-only modules are loaded: it may be correct
or it may be wrong, but we cannot tell which.
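
One way to confirm that a kernel is untainted before reporting is to
read /proc/sys/kernel/tainted, which stays non-zero until reboot once
any taint bit has been set. The sketch below decodes the low bits of
that value. The descriptions are paraphrased, the function names are
made up for illustration, and the bit table is deliberately incomplete
(newer kernels define more bits than shown here).

```python
# Low taint bits, paraphrased; assumed to match the running kernel.
TAINT_FLAGS = {
    0: "proprietary (non-GPL) module was loaded",
    1: "module was force-loaded",
    2: "unsafe SMP / out-of-spec system",
    3: "module was force-unloaded",
    4: "machine check exception occurred",
    5: "bad page was encountered",
}


def decode_taint(value):
    """Return descriptions for each known taint bit set in value."""
    return [desc for bit, desc in sorted(TAINT_FLAGS.items())
            if value & (1 << bit)]


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint value (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


# Example with a literal value: only bit 0 set.
print(decode_taint(1))
```

A value of 0 from read_taint() is what "the kernel is not tainted"
means in a report.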


--
Jesper Juhl <jesper.juhl@xxxxxxxxx>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html