Re: EDAC-AMD64: what is the ecc_msg good for ?

From: Gabriel C
Date: Wed Jan 10 2018 - 18:31:15 EST


On 11.01.2018 00:12, Borislav Petkov wrote:
On Thu, Jan 11, 2018 at 12:06:49AM +0100, Gabriel C wrote:
while doing some testing with an EPYC box I noticed
these strange messages when a Node is disabled.

I really do think the message is confusing, since
we print 'Node X: ... disabled' followed by an
INFO message from the EDAC driver which says the driver will not load.

And that is confusing because?

Because we see the following:

[ 4.694948] EDAC amd64: Node 6: DRAM ECC disabled.
[ 4.694949] EDAC amd64: ECC disabled in the BIOS or no ECC capability, module will not load.
Either enable ECC checking or force module loading by setting 'ecc_enable_override'.
(Note that use of the override may cause unknown side effects.)

The first one says the Node is disabled; the second is a
KERN_INFO message saying the *module* will not load.

Saying that the *module* will not load for 'this Node' should be clear to everyone.

Don't get me wrong, for me it is clear what this means. I don't need the
second message at all, but I have folks here who didn't understand wth that means.


Also, even worse, we suggest using ecc_enable_override then,
which can cause worse things. We really should not suggest
something like this by default.

That is a remnant from the old times. Family 0x17 and newer don't
allow that anymore.


So do we need a < fam17h check for that message then?
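
Something along these lines, maybe. This is only an untested userspace sketch to show the idea, not the actual amd64_edac.c code: the helper name ecc_msg_for_family() is made up, and the 0x17 cutoff is an assumption taken from the comment above that Family 0x17 and newer no longer allow the override. The message text is copied from the log output quoted earlier.

```c
/*
 * Hypothetical helper, for illustration only.
 * Returns the hint printed when ECC is found disabled on a node.
 */
static const char *ecc_msg_for_family(unsigned int family)
{
	/*
	 * Assumption: only families older than 0x17 can still use
	 * 'ecc_enable_override', so only they get the override hint.
	 */
	if (family < 0x17)
		return "ECC disabled in the BIOS or no ECC capability, module will not load.\n"
		       " Either enable ECC checking or force module loading by setting 'ecc_enable_override'.\n"
		       " (Note that use of the override may cause unknown side effects.)\n";

	/* Family 0x17 and newer: do not suggest the override at all. */
	return "ECC disabled in the BIOS or no ECC capability, module will not load.\n";
}
```

With something like this, a Family 0x17 box would only see the "module will not load" line, while older families keep the current text.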