RE: [RFC PATCH 0/3] RAS: Correctable Errors Collector thing

From: Luck, Tony
Date: Wed May 28 2014 - 13:22:25 EST


> A possible alternative would be to soft-offline the page. This is
> currently done in APEI code when corrected memory error thresholds are
> exceeded and reported by UEFI via a generic hardware error source
> (GHES).

+1

This is what the existing mcelog(8) daemon does when it sees an excessive
number of corrected errors on a page. It uses
/sys/devices/system/memory/soft_offline_page as the user->kernel
interface to reach this function.
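
For illustration, a minimal sketch of how a userspace tool might drive
that sysfs file. This is not mcelog's actual code (mcelog is written in
C). The helper name soft_offline() and its "path" parameter are
hypothetical, and the sketch assumes a kernel built with memory-failure
support and a caller running as root.

```python
# Sysfs file named in the message above. Writing a physical address to it
# asks the kernel to soft-offline the page containing that address.
SOFT_OFFLINE_PATH = "/sys/devices/system/memory/soft_offline_page"


def soft_offline(phys_addr: int, path: str = SOFT_OFFLINE_PATH) -> None:
    """Request soft-offlining of the page that contains phys_addr.

    Hypothetical helper: 'path' is overridable only so the function can
    be exercised against an ordinary file without touching real memory.
    """
    if phys_addr < 0:
        raise ValueError("physical address must be non-negative")
    # The address is written as text; a 0x-prefixed hex string is used here.
    with open(path, "w") as f:
        f.write(f"{phys_addr:#x}\n")
```

A daemon would call something like this only after its own per-page
corrected error count crosses whatever threshold it is configured with.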

-Tony