RE: [PATCH v22] edac, ras/hw_event.h: use events to handle hw issues

From: Luck, Tony
Date: Thu May 10 2012 - 18:37:53 EST


kworker/u:6-201 [007] .N.. 186.197280: mc_error: [Hardware Error]: mem_ctl#0: Corrected error memory read error on memory stick "DIMM_A1" (channel:0 slot:1 page:0x2f1eb3 offset:0x446 grain:32 syndrome:0x0 1 error(s): Unknown: Err=0001:0090 socket=0 channel=0/mask=1 rank=5)

The word "error" appears *five* times on this line (once with a capital E).
I feel beaten, bruised and ready to give up on this machine with just one
actual error reported :-)

We could get rid of one by:
s/Corrected error memory read error/Corrected memory read error/

(though we'd need to see if things still read well for all other "msg" options).
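
As a quick sanity check of that substitution, here is a minimal Python sketch (not kernel code; the helper name count_error is made up for illustration). It counts case-insensitive occurrences of "error" in the sample event text above, before and after the edit:

```python
import re

# Event text copied from the trace line above (task/CPU/timestamp prefix dropped).
line = ('mc_error: [Hardware Error]: mem_ctl#0: Corrected error memory read '
        'error on memory stick "DIMM_A1" (channel:0 slot:1 page:0x2f1eb3 '
        'offset:0x446 grain:32 syndrome:0x0 1 error(s): Unknown: '
        'Err=0001:0090 socket=0 channel=0/mask=1 rank=5)')


def count_error(text):
    """Hypothetical helper: count 'error' in text, ignoring case."""
    return len(re.findall(r"error", text, flags=re.IGNORECASE))


# Equivalent of: s/Corrected error memory read error/Corrected memory read error/
shortened = line.replace("Corrected error memory read error",
                         "Corrected memory read error", 1)

print(count_error(line), count_error(shortened))
```

Running the same check over the other "msg" strings the drivers emit would show whether the shorter wording still reads well for them.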

Or perhaps it could say:
... Corrected error: memory read on memory stick ...
or even:
... Corrected error: read on memory stick ...

This part could get shortened too:
mc_error: [Hardware Error]:
Will mc_error ever report something that isn't a "Hardware Error"?
I don't think we have to preserve this legacy string when moving
to a new reporting mechanism.


> There are still some space to improve the fields provided by the drivers.
Apart from reporting "channel" twice, that doesn't look too bad. Maybe
the "1 error(s)" could say "count: 1"?

-Tony
