RE: [PATCH v2 2/2] mce: acpi/apei: Add a boot option to disable ffmode for corrected errors
From: Luck, Tony
Date: Wed Jun 19 2013 - 18:09:24 EST
> The above question about what to do *without* going to userspace and
> back is maybe more interesting and we'd need a clean design there...
> we'll see.
Yes - this case (where the BIOS did all the threshold math and made the decision)
should be one where the Linux kernel could just implement the action directly.
Perhaps controlled by a knob to say whether we really trust the BIOS that much.
But we will also have cases where a smart user agent can correlate data
from multiple sources to identify the real root cause (e.g. some temperature
anomalies around the same time as some memory errors that occur at 10am
on the third Tuesday each month -> cause is the air conditioner maintenance guy
who shuts down the a/c for 10 minutes to change the filter).
I'll leave writing an agent that smart as an exercise for the concerned data
center manager :-)
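
For what it's worth, the most basic form of that correlation might look
something like the sketch below. This is only a toy, not anything an existing
tool does: the function name, the 15 minute window, and the idea that both
data sources arrive as plain lists of timestamps are all assumptions made up
for illustration.

```python
from datetime import datetime, timedelta

def correlate(mem_errors, temp_anomalies, window=timedelta(minutes=15)):
    """Return the memory-error timestamps that fall within `window`
    of at least one temperature anomaly.

    Both inputs are hypothetical: plain lists of datetime objects that
    some collector has already extracted from the error and sensor logs.
    """
    return [
        err for err in mem_errors
        if any(abs(err - anomaly) <= window for anomaly in temp_anomalies)
    ]

if __name__ == "__main__":
    # Made-up sample data: one error shortly after a temperature spike,
    # one unrelated error in the middle of the night.
    errors = [datetime(2013, 6, 18, 10, 5), datetime(2013, 6, 19, 3, 0)]
    spikes = [datetime(2013, 6, 18, 10, 0)]
    for ts in correlate(errors, spikes):
        print("memory error near temperature anomaly:", ts)
```

Noticing that the matches keep landing on the same day of the month (and
then finding the guy with the filter) is still the hard part.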