Re: System hangs (after some hours of working) since kernel 2.6.22.19
From: Pekka Enberg
Date: Fri Feb 06 2009 - 05:51:23 EST
Hi Alexander,
On Fri, Feb 6, 2009 at 11:06 AM, Alexander Naumann
<A.Naumann@xxxxxxxxxxx> wrote:
> Summary of this bug is that the system stops working after a couple of hours (between 30 minutes and 6 hours).
> There are no messages at all.
> It just stops; I cannot even use keys like SysRq (the Alt-PrintScreen combination).
>
> The only thing I am running is tiobench (to force the system to hang):
> "tiobench --numruns 10000"
> If I am running nothing, the system does not stop working, or it takes very long until it hangs.
>
> It cannot be a defect in one particular machine because I have several identical machines and they all have the same problem.
>
> This error has occurred since kernel version 2.6.22.19.
> I have tried several kernels: 2.6.23, 2.6.24, 2.6.24.7, 2.6.25, 2.6.26, 2.6.28.2
> With all of them the system hangs without any message.
> Before 2.6.22.19 I had no problems at all.
Does this mean that, for example, 2.6.22.18 works? Or do you mean
2.6.21.x worked and it stopped working starting from 2.6.22.x?
On Fri, Feb 6, 2009 at 11:06 AM, Alexander Naumann
<A.Naumann@xxxxxxxxxxx> wrote:
> I have found a kernel configuration with which this hang does not occur:
> In 2.6.24.7 I enabled ALL kernel debug options, and the system did not crash.
I guess 'crashing.config' is _not_ the config you mention here (it
doesn't enable any of the kernel debugging options)?
On Fri, Feb 6, 2009 at 11:06 AM, Alexander Naumann
<A.Naumann@xxxxxxxxxxx> wrote:
> Also I have included two config files for kernel 2.6.22.19.
> With one of them the system crashes (crashing.config); with the other one it does not (working.config).
> The only difference between these two config files is that I enabled "Kernel Debug".
>
> With all other tested kernel versions I cannot see the same behaviour; they all crash after some hours, no matter how the kernel is configured.
Just enabling CONFIG_DEBUG_KERNEL doesn't change anything so it's
probably just that the bug is hard to trigger.
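To double-check that the two configs really differ only in that one
option, something along these lines could help. This is only a rough
sketch: the helper names (parse_config, diff_configs) are made up for
illustration, and it assumes the usual .config line format of
"CONFIG_FOO=value" and "# CONFIG_FOO is not set".

```python
import re
import sys

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def diff_configs(a, b):
    """Return sorted (option, value_in_a, value_in_b) for every difference.

    An option missing from one side is treated as 'n' (an assumption).
    """
    keys = set(a) | set(b)
    return sorted(
        (k, a.get(k, "n"), b.get(k, "n"))
        for k in keys
        if a.get(k, "n") != b.get(k, "n")
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python configdiff.py crashing.config working.config
    with open(sys.argv[1]) as f1, open(sys.argv[2]) as f2:
        for opt, old, new in diff_configs(parse_config(f1.read()),
                                          parse_config(f2.read())):
            print("%s: %s -> %s" % (opt, old, new))
```

If the output lists anything beyond CONFIG_DEBUG_KERNEL, those extra
options would be worth a closer look.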
On Fri, Feb 6, 2009 at 11:06 AM, Alexander Naumann
<A.Naumann@xxxxxxxxxxx> wrote:
> cat /proc/cpuinfo:
> processor       : 0
> vendor_id       : CentaurHauls
> cpu family      : 6
> model           : 9
> model name      : VIA Nehemiah
> stepping        : 8
> cpu MHz         : 998.537
> cache size      : 64 KB
> fdiv_bug        : no
> hlt_bug         : no
> f00f_bug        : no
> coma_bug        : no
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 1
> wp              : yes
> flags           : fpu vme de pse tsc msr cx8 sep mtrr pge cmov pat mmx fxsr sse up rng rng_en ace ace_en
> bogomips        : 1997.07
> clflush size    : 32
> power management:
This makes me wonder if there's some generic x86 change that's
behaving badly on VIA CPUs that don't get as much test coverage. One
thing you might want to try out is to disable CONFIG_X86_GENERIC and
use CONFIG_MVIAC3_2 instead of CONFIG_M586.
If that doesn't work out, you might want to strip your config down to
the bare minimum to see if the hang goes away. Looking at your config,
at least CONFIG_SMP, CONFIG_AGP, and CONFIG_DRM strike me as good suspects.
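To keep track of which of these options a given test kernel actually
has set, a small helper like the one below might be handy. It is just
a sketch: the function names and the SUSPECTS list are my own, and it
assumes the standard .config line format, treating an option that is
absent from the file the same as "is not set".

```python
# Options mentioned above as worth toggling; the list is illustrative.
SUSPECTS = [
    "CONFIG_X86_GENERIC",
    "CONFIG_M586",
    "CONFIG_MVIAC3_2",
    "CONFIG_SMP",
    "CONFIG_AGP",
    "CONFIG_DRM",
]


def option_state(config_text, option):
    """Return the value of one option ('y', 'm', ...) or 'n' if unset."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == "# %s is not set" % option:
            return "n"
    return "n"  # assumption: absent means not set


def report(config_text, options=SUSPECTS):
    """Map each option of interest to its state in the given config."""
    return {opt: option_state(config_text, opt) for opt in options}


def format_report(config_text, options=SUSPECTS):
    """Render the report as one 'OPTION=state' line per option."""
    states = report(config_text, options)
    return "\n".join("%s=%s" % (opt, states[opt]) for opt in options)
```

Running format_report() over each .config you test and keeping the
output next to the result (hang or no hang) should make it easier to
see which option lines up with the problem.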
Hope this helps,
Pekka