Re: [PATCH] RAS/CEC: Add debugfs switch to disable at run time

From: Borislav Petkov
Date: Thu Apr 18 2019 - 20:27:01 EST

Next message: Naoya Horiguchi: "[PATCH v2] tools/power: turbostat: make output buffer extensible (Re: [PATCH v1] tools/power: turbostat: fix buffer overrun)"
Previous message: Bjorn Andersson: "Re: [PATCH 1/1] iommu/arm-smmu: Log CBFRSYNRA register on context fault"
In reply to: Cong Wang: "Re: [PATCH] RAS/CEC: Add debugfs switch to disable at run time"
Next in thread: Cong Wang: "Re: [PATCH] RAS/CEC: Add debugfs switch to disable at run time"

On Thu, Apr 18, 2019 at 04:58:22PM -0700, Cong Wang wrote:
> No, it is all about whether we should break users' expectation.

What user expectation?

> This doesn't sound like a valid reason for us to break users'
> expectation.

I think it is *you* who has some sort of "expectation" but that
"expectation" is wrong.

> Prior to CONFIG_RAS, mcelog just works fine for users (at least Intel
> users). Suddenly after enabling CONFIG_RAS in kernel, mcelog will
> no longer receive any correctable memory errors _silently_.

That is, of course, wrong too.

> What's more, we don't even have rasdaemon running in our system, so

Are you saying "we" to mean "we the users" or some company "we"?

And that is wrong too, there's at least one rasdaemon:

http://git.infradead.org/users/mchehab/rasdaemon.git

> there is no consumer of RAS CEC,

RAS CEC doesn't need a consumer. You're misunderstanding the whole
concept of the error collector.

> these errors just simply disappear from users' expected place.

They "disappear" because you have CONFIG_RAS_CEC enabled. But they
don't really disappear - they're collected by the thing to filter out
only the pages which keep generating errors constantly and those get
soft-offlined.

The sporadic ones simply get ignored because they don't happen again
and are only the result of alpha particles or overheating conditions or
whatever.

Now here's the CEC help text:

config RAS_CEC
	bool "Correctable Errors Collector"
	depends on X86_MCE && MEMORY_FAILURE && DEBUG_FS
	---help---
	  This is a small cache which collects correctable memory errors per 4K
	  page PFN and counts their repeated occurrence. Once the counter for a
	  PFN overflows, we try to soft-offline that page as we take it to mean
	  that it has reached a relatively high error count and would probably
	  be best if we don't use it anymore.

	  Bear in mind that this is absolutely useless if your platform doesn't
	  have ECC DIMMs and doesn't have DRAM ECC checking enabled in the BIOS.

You can tell me what in that text is not clear so that I can make it
clearer and more obvious what that thing is.
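
To illustrate the idea in the help text above, here is a minimal userspace
sketch of a per-PFN error counter. It is not the kernel's implementation:
the names (cec_cache, cec_record), the cache size and the threshold are all
hypothetical and chosen only for the example. It shows the counting and the
"counter overflows, so offline the page" decision, nothing more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_4K	12	/* 4K pages, as in the help text */
#define CACHE_SLOTS	8	/* hypothetical cache size */
#define ERR_THRESHOLD	3	/* hypothetical count that triggers offlining */

struct cec_slot {
	uint64_t pfn;		/* page frame number being tracked */
	unsigned int count;	/* correctable errors seen on that PFN */
	bool used;
};

struct cec_cache {
	struct cec_slot slots[CACHE_SLOTS];
};

/*
 * Record one correctable error at phys_addr.
 * Returns true when the page's count reaches ERR_THRESHOLD, i.e. when a
 * caller would try to soft-offline the page. The entry is dropped then.
 */
static bool cec_record(struct cec_cache *c, uint64_t phys_addr)
{
	uint64_t pfn = phys_addr >> PAGE_SHIFT_4K;
	struct cec_slot *free_slot = NULL;
	struct cec_slot *victim = &c->slots[0];
	struct cec_slot *s;
	size_t i;

	for (i = 0; i < CACHE_SLOTS; i++) {
		s = &c->slots[i];

		if (!s->used) {
			if (!free_slot)
				free_slot = s;
			continue;
		}

		if (s->pfn == pfn) {
			if (++s->count >= ERR_THRESHOLD) {
				s->used = false;
				return true;
			}
			return false;
		}

		/* Remember the least-hit entry in case the cache is full. */
		if (s->count < victim->count)
			victim = s;
	}

	/* New PFN: take a free slot, else evict the least-hit entry. */
	s = free_slot ? free_slot : victim;
	s->pfn = pfn;
	s->count = 1;
	s->used = true;
	return false;
}
```

In this sketch a page that errors once and never again just sits in the
cache until it is evicted, while a page that keeps erroring reaches the
threshold and gets reported for offlining.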

> I know CONFIG_RAS is a new feature supposed to replace MCELOG,

No, it isn't. CONFIG_RAS is supposed to collect all the RAS-related
functionality in the kernel and it looks like you have some
misconceptions about it.

> but they can co-exist in kernel config, which means mcelog should
> continue to work as before until it gets fully replaced.

For that you need to enable X86_MCELOG_LEGACY.

And let me repeat it again - if you want to collect errors in userspace,
do not enable RAS_CEC at all.

> Even the following PoC change could make this situation better,
> because with this change when we enable CONFIG_RAS, mcelog
> will break _loudly_ rather than just silently, users will notice mcelog
> is no longer supported and will look for its alternative choice.

You have somehow put in your head that CONFIG_RAS is the counterpart of
CONFIG_X86_MCELOG_LEGACY. Which is *simply* *not* *true*.

And the moment you realize that, then you'll be a step further in the
right direction.

So enable X86_MCELOG_LEGACY and you can collect all the errors you wish.
And there's a rasdaemon which you can use too, as I pointed out above, if
you don't want mcelog.

CEC is something *completely* different and its purpose is to run in the
kernel and prevent users and admins from getting upset unnecessarily over
every sporadic correctable error. Otherwise, just because an alpha particle
flew through their DIMMs, they all start running in headless chicken mode,
trying to RMA perfectly good hardware.

Now, if any of that above still doesn't make it clear, please state what
you're trying to achieve and I'll try to help.

--
Regards/Gruss,
Boris.
