Re: system keeps freezing once every 24 hours / random apps crashing

From: Mark v Wolher
Date: Fri Dec 30 2005 - 10:57:24 EST




Jesper Juhl wrote:
> On 12/30/05, Trilight <trilight@xxxxxxxxx> wrote:
>
>>Hiya,
>>
>>I'm using the 2.6.14.5 kernel and i notice that the system freezes
>>sometimes, within 24 hours usually, a total freeze, no mouse/keyb
>>reaction. Also i notice that apps crash randomly sometimes.
>>
>
> When did this start to happen? Was it OK with a previous kernel
> version? if it was ok with a previous version, then what was that
> version?
> Was it OK before you added a particular piece of hardware? If so, what
> hardware? Have you tried removing that hardware to see if the problem
> goes away?
>
>
>>What can i do to investigate this ?
>>
>
> A few things you can try :
>
> 1) Start by providing some more info. Some details on your
> hardware/software. Something like the following + whatever else you
> consider relevant :
> - name and version of your Linux distribution
> - output of the scripts/ver_linux script found in the kernel source
> - your kernels .config file
> - full dmesg output after boot
> - Motherboard name/model
> - output of cat /proc/cpuinfo
> - output of cat /proc/meminfo
> - output of lspci -vv
> - output of lsusb
>
> 2) Tell us what you have already tried in order to try and resolve the
> problem, including your results with the various things you've tried.
>
> 3) Try building/running a kernel with the various debug options found
> in the kernel hacking section turned on and see if that results in
> more details in dmesg/logs etc and provide the extra info if any.
>
> 4) Try building a 2.6.15-rc7-git4 kernel with the same config and see
> if that one also has problems.
>
> Make sure your hardware is OK, CPU not overheating, RAM is OK (run
> memtest86 with all tests enabled overnight) etc.
>
> Try removing all extra hardware components in your system you don't
> need for the system to boot and see if the problem then goes away. If
> it does, try adding back hardware one piece at a time and re-test,
> find out if it's related to a certain piece of hardware or a specific
> driver.

<..>

Thanks for the advice!

About the memory test: I already did that, 7 full passes with no errors
(it's 512 MB of ECC memory, by the way). I'm going to let it run the
whole night when I go to sleep.

hardware:

The system is a Dell Precision Workstation 650, dual Xeon 2.4 GHz w/HT,
Intel E7505 motherboard.

distro: Debian sarge
kernel: vanilla 2.6.14.5

Beyond that, there is nothing unusual in the dmesg, lspci, or lsusb
output, and cpuinfo shows everything it should.

Meminfo:

MemTotal: 512528 kB
MemFree: 8760 kB
Buffers: 2656 kB
Cached: 236216 kB
SwapCached: 2052 kB
Active: 390480 kB
Inactive: 54756 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 512528 kB
LowFree: 8760 kB
SwapTotal: 4883680 kB
SwapFree: 4864064 kB
Dirty: 112 kB
Writeback: 0 kB
Mapped: 388988 kB
Slab: 23320 kB
CommitLimit: 5139944 kB
Committed_AS: 518952 kB
PageTables: 1912 kB
VmallocTotal: 515796 kB
VmallocUsed: 25496 kB
VmallocChunk: 487120 kB
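
A note on reading these numbers: the low MemFree figure is not alarming
on its own, since Linux uses otherwise idle RAM for buffers and page
cache. The following is only a rough sketch of how the figures above can
be summarised. The function names are made up for illustration, and
"MemFree + Buffers + Cached" is a common approximation of reclaimable
memory, not an exact figure.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of kB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def reclaimable_kb(info):
    # Approximation: free pages plus buffers plus page cache.
    return info["MemFree"] + info["Buffers"] + info["Cached"]


def swap_used_kb(info):
    return info["SwapTotal"] - info["SwapFree"]


# Values copied from the meminfo dump above.
SAMPLE = """\
MemTotal: 512528 kB
MemFree: 8760 kB
Buffers: 2656 kB
Cached: 236216 kB
SwapTotal: 4883680 kB
SwapFree: 4864064 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print("roughly reclaimable:", reclaimable_kb(info), "kB")
    print("swap in use:", swap_used_kb(info), "kB")
```

On a live system the same functions could be fed the contents of
/proc/meminfo instead of SAMPLE.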


Other findings:

- all kernels had the same issue, except (not 100% sure) the 2.4.2x
kernels
- tried acpi=noirq and many other ACPI options and combinations, without
success
- replaced the nvidia binary driver with the kernel nv driver, also
without success

I have no particular reason to suspect the TV card, a Terratec Value
with a bt878 chip, which is supported in the kernel. It could still be
the TV card, but I see nothing that ties it to the problem. I also tried
using DAC snoop in the BIOS, but that did not help.

None of the issues occur when running Windows XP Pro or Red Hat
Enterprise Linux 4.

I'm going to leave the memory test running for the whole night, and I'll
also compile a kernel with the debugging options on. I doubt the
debugging options will help, though, since nothing is logged when the
freeze occurs.
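
Since nothing reaches the logs, one way to at least narrow down when a
freeze happens is a small heartbeat writer that appends a timestamp to a
file and forces it to disk on every beat. This is only a sketch, not
something that has been run on the affected machine; the function names,
file path and interval are hypothetical.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line to path and force it to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(path, "a") as f:
        f.write(stamp + "\n")
        f.flush()
        # fsync so the line is not left in the page cache when the
        # machine locks up.
        os.fsync(f.fileno())
    return stamp


def run(path, interval=10, beats=None):
    """Write a heartbeat every `interval` seconds.

    beats=None loops until interrupted; an integer limits the count.
    """
    written = 0
    while beats is None or written < beats:
        write_heartbeat(path)
        written += 1
        time.sleep(interval)
    return written
```

Calling run("heartbeat.log") and checking the last line after a reboot
would bound the time of the freeze to within one interval.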

Greetz,

Mark





