Re: Random reboots

From: Ryan Richter
Date: Wed Feb 15 2006 - 09:26:12 EST


On Tue, Feb 14, 2006 at 11:22:22PM +0100, Jean Delvare wrote:
> You seem to have hardware monitoring drivers loaded on the system, so
> I'd suggest that you watch the returned values over time. If the
> hardware is going wrong it might show there. Your system could be
> overheating for some reason (stuck fan...)
>
> The fact that older kernels were seemingly working better doesn't prove
> much. You were running these kernels before, not now, and hardware
> *does* age, contrary to what people seem to think. If you want to make
> certain that older kernels were indeed working better for purely
> software reasons, you should switch back to such an old kernel and see
> if things actually improve or not.
>
> Note that the first case ("a kernel came out that fixed the problem")
> doesn't mean that the hardware was not at fault. There are quite a few
> quirks in the Linux kernel code which are just there to work around
> known hardware or BIOS bugs.

No, the old kernels still have all the bugs they ever did (of course).
I tested it during the st-iommu-doublefree debugging. I do not plan on
running the old kernel again, mainly because it has so many irritating
bugs (df doesn't work, the serial console stalls on boot, so it won't
boot without handholding, etc. etc.). I'd have to run it for at least a
month to verify, and the old kernel has security vulnerabilities and so
on.

The sensors report a bunch of obvious nonsense as always... I keep them
configured in with the hope that one day they'll report useful
information, but that day hasn't come yet. I just checked, and all the
fans are still fine. It's in a huge case with lots of fans and it's
hardly warmer than room temp. The Opteron 240s don't put out much heat.

I'm still thoroughly convinced it's a kernel bug.

-ryan