Re: noisy edac

From: doug thompson
Date: Mon Jan 30 2006 - 16:02:57 EST


On Mon, 2006-01-30 at 11:58 -0800, Dave Peterson wrote:
> On Monday 30 January 2006 10:59, Doug Thompson wrote:
> > that driver should be refactored to only output NON-FATALs with debug
> > turned on.
>
> I would prefer a sysfs option or something similar that allows the user
> to determine what action to take on these errors. I think the debug
> option should only pertain to messages whose purpose is for debugging
> the EDAC code itself, as opposed to hardware errors detected by EDAC.

Something like an error report verbosity level? Say, 0 to 7?

0 being quiet and 7 being very verbose? Or the reverse.

/sys/drivers/system/edac/mc/error_report_verbosity ????
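
To make the idea concrete, here is a minimal user-space sketch of how such
a level could gate messages. Everything in it is an assumption for
illustration only: the names edac_set_verbosity() and edac_report(), the
choice of 0 as quiet and 7 as most verbose, and the per-message levels are
hypothetical and are not existing EDAC code. A plain variable stands in
for the value the sysfs file would hold, and stderr stands in for
printk().

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical range: 0 = quiet ... 7 = very verbose.
 * The thread leaves the direction open; this sketch picks one. */
enum { EDAC_VERB_QUIET = 0, EDAC_VERB_MAX = 7 };

/* Stand-in for the value a sysfs attribute would expose. */
static int error_report_verbosity = 3;

/* Clamp a requested level into the valid range and store it.
 * Returns the level actually stored. */
static int edac_set_verbosity(int level)
{
    if (level < EDAC_VERB_QUIET)
        level = EDAC_VERB_QUIET;
    if (level > EDAC_VERB_MAX)
        level = EDAC_VERB_MAX;
    error_report_verbosity = level;
    return level;
}

/* Emit a message only if its level is at or below the current
 * verbosity. Returns 1 if the message was printed, 0 if suppressed.
 * Message levels start at 1, so verbosity 0 silences everything. */
static int edac_report(int msg_level, const char *fmt, ...)
{
    va_list ap;

    if (msg_level > error_report_verbosity)
        return 0;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);   /* printk() stand-in */
    va_end(ap);
    return 1;
}
```

With the default of 3, a call such as
edac_report(4, "Non-Fatal Error PCI Express B\n") is suppressed, while a
message reported at level 1 still gets through.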

This tackles the immediate issue, but there is a systemic issue we have
to face sometime.

One problem that this e752x_edac module exhibits, and which shows up in
all of the drivers to one degree or another, is that driver-specific
error messages are output directly, since there is no abstracted error
interface (yet) in the EDAC core. The messages are, or can be, very
specific to the MC being driven. In time, we can (and should) add a
better MC error interface to the core and then map each driver's
MC-specific errors onto the new core error interface. This is similar to
how SCSI and SATA have higher-level abstract errors to which the
transport drivers map their errors.
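
As a rough illustration of that mapping, a driver could carry a small
table that translates its private error codes into core-level classes.
This is only a sketch under stated assumptions: enum edac_core_err, the
demo_mc_* codes, and demo_mc_to_core() are all made up for the example
and do not exist in the EDAC core or in any driver.

```c
#include <stddef.h>

/* Hypothetical core-level error classes (not an existing EDAC API). */
enum edac_core_err {
    EDAC_CORE_ERR_UNKNOWN = 0,
    EDAC_CORE_ERR_CORRECTED,
    EDAC_CORE_ERR_NONFATAL,
    EDAC_CORE_ERR_FATAL
};

/* Hypothetical driver-private codes for one memory controller. */
enum demo_mc_err {
    DEMO_MC_PCIE_B_NONFATAL = 1,
    DEMO_MC_PCIE_B_FATAL,
    DEMO_MC_DRAM_SINGLE_BIT
};

/* One row per driver-specific code the driver knows how to classify. */
struct demo_err_map {
    int drv_code;
    enum edac_core_err core;
};

static const struct demo_err_map demo_mc_map[] = {
    { DEMO_MC_PCIE_B_NONFATAL, EDAC_CORE_ERR_NONFATAL },
    { DEMO_MC_PCIE_B_FATAL,    EDAC_CORE_ERR_FATAL },
    { DEMO_MC_DRAM_SINGLE_BIT, EDAC_CORE_ERR_CORRECTED }
};

/* Translate a driver-private code into a core class.
 * Codes missing from the table fall back to UNKNOWN. */
static enum edac_core_err demo_mc_to_core(int drv_code)
{
    size_t i;

    for (i = 0; i < sizeof(demo_mc_map) / sizeof(demo_mc_map[0]); i++) {
        if (demo_mc_map[i].drv_code == drv_code)
            return demo_mc_map[i].core;
    }
    return EDAC_CORE_ERR_UNKNOWN;
}
```

The core would then decide how to report each class, so the policy lives
in one place instead of in every driver's printk() calls.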

This e752x_edac module simply calls printk() with KERN_WARNING, without
any other output control.

This looks like the old "how do we report errors" problem, with the
first implementation now showing its age.

doug t



>
> > Copying to edac/bluesmoke mailing list
> >
> > doug t
> >
> > --- Dave Jones <davej@xxxxxxxxxx> wrote:
> > > On Sun, Jan 29, 2006 at 04:52:06PM -0500, Alan Cox wrote:
> > > > On Thu, Jan 26, 2006 at 08:41:05PM -0500, Dave Jones wrote:
> > > > > e752x_edac is very noisy on my PCIE system..
> > > > > my dmesg is filled with these...
> > > > >
> > > > > [91671.488379] Non-Fatal Error PCI Express B
> > > > > [91671.492468] Non-Fatal Error PCI Express B
> > > > > [91901.100576] Non-Fatal Error PCI Express B
> > > > > [91901.104675] Non-Fatal Error PCI Express B
> > > >
> > > > Pre-production system or final release ?
> > >
> > > Currently shipping Dell Precision 470.
> > >
> > > Dave
>

