My understanding is that ECC corrects one-bit errors only when a
memory location is read. It would be very cool to have a kernel
thread (perhaps in the idle task) that slowly goes through physical
memory reading every byte. That would allow one-bit errors to be
corrected before they become two-bit errors.
Cycling through memory relatively infrequently (on the order of once
an hour, maybe even one or two times a day) would probably be often
enough, so cache pollution wouldn't be a concern.
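
To make the idea concrete, here is a rough userspace sketch, not kernel code. It assumes, as above, that simply reading a location is enough for the ECC hardware to correct a one-bit error; whether the corrected value is also written back to memory depends on the memory controller. The names scrub_region() and scrub_rate() are hypothetical and made up for this illustration. A real kernel thread would also have to walk physical pages and make sure its reads are not satisfied from the CPU cache.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: read every 64-bit word in a region once.
 * The volatile qualifier keeps the compiler from optimizing the
 * loads away. The XOR of all words is returned only so that the
 * reads have an observable result.
 */
static uint64_t scrub_region(const volatile uint64_t *base, size_t nwords)
{
    uint64_t sink = 0;

    for (size_t i = 0; i < nwords; i++)
        sink ^= base[i];

    return sink;
}

/*
 * Hypothetical helper: bytes per second that must be read to cover
 * total_bytes once every period_secs seconds (rounded up).
 * Returns 0 if period_secs is 0.
 */
static uint64_t scrub_rate(uint64_t total_bytes, uint64_t period_secs)
{
    if (period_secs == 0)
        return 0;

    return total_bytes / period_secs + (total_bytes % period_secs != 0);
}
```

For example, covering 1 GiB once an hour works out to scrub_rate(1073741824, 3600) = 298262 bytes per second, or roughly 291 KiB/s, which is a very small fraction of memory bandwidth.
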
We have a lot of Linux machines here to test patches that do this,
help with debugging, etc. :-)
Dan
-- Daniel Quinlan (at work) quinlan@transmeta.com