Re: EDAC (was: Re: 2.6.14-rc5-mm1)

From: Sander
Date: Thu Oct 27 2005 - 01:27:55 EST


(Cc: trimmed)

Doug Thompson wrote (ao):
> --- Sander <sander@xxxxxxxxxxx> wrote:
> > Doug Thompson wrote (ao):
> > > This bridge is either having a parity error on
> > > the main bus OR more likely is generating false
> > > positives. How to determine which? More investigation
> > > is needed.
> >
> > Anything I can do?
>
> To help? Keep an eye out for other devices which post
> parity errors.

Ok :-) But you also say more investigation is needed. Is this something
I can help with, as an owner of this hardware?
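
For watching other devices, something like the following might do as a
starting point. It is a minimal sketch, not how EDAC itself does it: it
assumes the sysfs "config" files under /sys/bus/pci/devices are readable,
and it only looks at the two parity bits of the standard PCI Status
register (offset 0x06). The function names are made up for illustration.
The bits are sticky, so a set bit only says an error was posted at some
point since it was last cleared.

```python
import glob
import os
import struct

PCI_STATUS = 0x06                    # offset of the 16-bit Status register
PCI_STATUS_PARITY = 0x0100           # bit 8: Master Data Parity Error
PCI_STATUS_DETECTED_PARITY = 0x8000  # bit 15: Detected Parity Error


def parity_flags(config):
    """Return the names of the parity bits set in a raw PCI config header."""
    if len(config) < PCI_STATUS + 2:
        raise ValueError("config space too short")
    # Config space is little-endian; Status is an unsigned 16-bit value.
    (status,) = struct.unpack_from("<H", config, PCI_STATUS)
    flags = []
    if status & PCI_STATUS_PARITY:
        flags.append("master-data-parity")
    if status & PCI_STATUS_DETECTED_PARITY:
        flags.append("detected-parity")
    return flags


def scan(root="/sys/bus/pci/devices"):
    """Yield (device, flags) for every device with a parity bit set."""
    for path in sorted(glob.glob(os.path.join(root, "*", "config"))):
        try:
            with open(path, "rb") as f:
                flags = parity_flags(f.read(64))
        except (OSError, ValueError):
            continue  # unreadable or truncated config space: skip it
        if flags:
            yield os.path.basename(os.path.dirname(path)), flags


if __name__ == "__main__":
    for dev, flags in scan():
        print(dev, ", ".join(flags))
```

This only reads the status bits and never clears them, so it cannot tell
a fresh error from an old one.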

And is there an EDAC list that reports should go to, or is lkml fine?
There is no MAINTAINERS entry or info in Documentation (in
2.6.14-rc4-mm1).

If not, is it useful if I make a website and collect reports?

> > If so, does it make more sense not to configure EDAC?
>
> depends on your requirements.
>
> We have been living with systems with PCI devices for
> a decade now. How many times have events occurred that
> had no explanation and were simply dismissed? There
> were no detectors.
>
> We assume many things, even today. How many desktops
> with gigs of memory have no ECC? I have learned my
> lesson while refactoring bluesmoke/edac that ECC is
> very important. ECC always in my machines from now on,
> for me anyway.
>
> For PCI devices, if you want to "know" data is being
> transmitted correctly, then there need to be
> "detector", "reporter" and "handler" agents for these
> bad events to properly notice, report and process
> them.

I'd better configure EDAC :-)

Thanks for your answers Doug.

--
Humilis IT Services and Solutions
http://www.humilis.net