RE: [PATCH] x86, MCE: support memory error recovery for both UCNA and Deferred error in machine_check_poll
From: Luck, Tony
Date: Thu Oct 23 2014 - 13:19:23 EST
> The general idea of preemptively poisoning pages which contain deferred
> errors is fine though.
Agreed. I used to think that it wasn't likely to be very useful, because in many
cases the UCNA errors are just a trail of breadcrumbs left by different units
on the chip as the poison passes through on its way to consumption, where
there would be a fatal (or recoverable) error.
But recently I found that a partial write to a poisoned cache line only leaves the
trail of UCNA errors: nothing consumes the poison, so there is no machine check.
In this case it would definitely be worthwhile to trigger the same action that we
take for SRAO and unmap the page before someone does read it.
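
As a rough illustration of the point above (not the patch under discussion),
the decision could be sketched as below. The bit positions follow the
MCi_STATUS layout in the Intel SDM; the macro and function names are
hypothetical, the check for the enable bit and the MCA error code is left out,
and AMD Deferred errors are not handled at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* MCi_STATUS bits (Intel SDM layout); macro names are made up for this sketch. */
#define STATUS_VAL   (1ULL << 63)  /* register contents are valid */
#define STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define STATUS_MISCV (1ULL << 59)  /* MCi_MISC is valid */
#define STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR is valid */
#define STATUS_PCC   (1ULL << 57)  /* processor context corrupt */
#define STATUS_S     (1ULL << 56)  /* signaled as a machine check */
#define STATUS_AR    (1ULL << 55)  /* action required */

/* UCNA: uncorrected, but no context corruption and no signaling. */
static bool is_ucna(uint64_t status)
{
	return (status & STATUS_VAL) && (status & STATUS_UC) &&
	       !(status & (STATUS_PCC | STATUS_S | STATUS_AR));
}

/* SRAO: uncorrected and signaled, action optional. */
static bool is_srao(uint64_t status)
{
	return (status & STATUS_VAL) && (status & STATUS_UC) &&
	       (status & STATUS_S) &&
	       !(status & (STATUS_PCC | STATUS_AR));
}

/*
 * Hypothetical policy: treat UCNA like SRAO and ask for the page to be
 * taken offline, but only when the bank logged a usable address.
 */
static bool wants_page_offline(uint64_t status)
{
	if (!(status & STATUS_ADDRV) || !(status & STATUS_MISCV))
		return false;
	return is_ucna(status) || is_srao(status);
}
```

A poll loop would call wants_page_offline() on each bank's status and, when it
returns true, hand the logged address to whatever path already handles SRAO.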
-Tony