Re: EDAC i5000 MC0: NON-FATAL ERRORS

From: Doug Thompson
Date: Mon Jun 02 2008 - 14:46:57 EST



--- Jack Howarth <howarth@xxxxxxxxxxxxxxxxx> wrote:

> I am seeing the following errors on a Fedora 7 x86_64 linux box,
> running on a Tyan Tempest i5000XL motherboard, after upgrading from
> kernel-2.6.25-14.fc9.x86_64 to kernel-2.6.25.3-18.fc9.x86_64...
>
> May 25 04:30:56 fourier kernel: EDAC i5000 MC0: NON-FATAL ERRORS Found!!! 1st NON-FATAL Err Reg= 0x10000
> May 25 04:30:56 fourier kernel: EDAC MC0: CE row 1, channel 0, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=14339 CAS=672, CE Err=0x10000)
>
> These messages occur about once an hour and are always for the same DRAM-Bank.
> I've not been able to find any memory errors when running memtest86 with or
> without ECC checking being enabled. Are there any known issues with the
> EDAC support in the kernel that might cause false positives like this?
> The errors are always marked as non-fatal and are for reads. Also, does
> anyone know how this code numbers ram banks? Is the first ram bank considered
> 0 or 1 by this code? Thanks in advance for any clarifications.
> Jack

Yes, it is a known false positive bug.

The hardware has some type of error which it calls NON-FATAL, and the driver is TOO verbose in
reporting that event. I am working on a patch to quiet that down.
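
Until that patch lands, one way to check whether the reports really all point at the same
location is to tally them from the log. Below is a rough sketch in Python, assuming each
message sits on a single line in the format shown in the quoted log above. The name tally_ce
and the regular expression are made up for illustration; they are not part of EDAC, and the
field layout is inferred from that one sample only.

```python
import re
from collections import Counter

# Pattern for the "CE row ..." report as it appears in the quoted log.
# Assumption: the field order is inferred from a single sample line and
# may differ on other kernels or other EDAC drivers.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE row (?P<row>\d+), channel (?P<channel>\d+),"
    r".*?Branch=(?P<branch>\d+) DRAM-Bank=(?P<bank>\d+)"
)

FIELDS = ("mc", "row", "channel", "branch", "bank")


def tally_ce(lines):
    """Count corrected-error reports per (mc, row, channel, branch, bank)."""
    counts = Counter()
    for line in lines:
        match = CE_LINE.search(line)
        if match:
            key = tuple(int(match.group(name)) for name in FIELDS)
            counts[key] += 1
    return counts


# The two messages from the report, each on one line.
sample = [
    "May 25 04:30:56 fourier kernel: EDAC i5000 MC0: NON-FATAL ERRORS "
    "Found!!! 1st NON-FATAL Err Reg= 0x10000",
    'May 25 04:30:56 fourier kernel: EDAC MC0: CE row 1, channel 0, '
    'label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=14339 CAS=672, '
    "CE Err=0x10000)",
]

for location, count in tally_ce(sample).items():
    print(dict(zip(FIELDS, location)), count)
```

Passing an open log file instead of the sample list works the same way, since the function
only iterates over lines. Where the kernel messages end up (for example /var/log/messages)
depends on the syslog setup, so treat that path as an assumption.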

thanks

doug t


> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>


W1DUG