Since most of these problems are not 100% reproducible, but depend on timing,
temperature, and so on, it would be nice to have a kernel thread(?) which does
this. I wouldn't enable it normally, but when I suspect an error in my RAM,
it would be a good idea.
> > The main problem with it, from a technical standpoint, is that unlike
> > ECC all you know is that a page was corrupted, so you have to throw it
> > out. If it was dirty, or in use, what do you do?
If it is dirty, you can't detect the error at all: the contents may have
changed legitimately through a write, so the stored checksum no longer
applies and no checksum compare will work.
I wouldn't view this as some kind of ECC, but rather as something like
parity-RAM. You will get notified when you have bad RAM, and at what address,
so you could investigate further.
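
To make the parity-like behaviour concrete, here is a rough user-space sketch
in C of what such a checker might do. It is not kernel code: struct
tracked_page, page_mark_clean(), page_mark_dirty() and scan_pages() are names
I made up for illustration, and the Adler-style checksum is only one possible
choice. The idea is to record a checksum while a page is clean, skip pages
that have been dirtied, and report the first clean page whose contents no
longer match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096u

/* Hypothetical per-page bookkeeping; not an existing kernel structure. */
struct tracked_page {
    const uint8_t *data;  /* start of the page */
    uint32_t       sum;   /* checksum recorded while the page was clean */
    int            dirty; /* nonzero: contents changed legitimately */
};

/* Adler-style checksum; any reasonably strong checksum would do here. */
static uint32_t page_checksum(const uint8_t *p)
{
    uint32_t a = 0, b = 0;

    for (size_t i = 0; i < PAGE_BYTES; i++) {
        a = (a + p[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Record the checksum of a page that is known to be clean. */
void page_mark_clean(struct tracked_page *pg)
{
    assert(pg != NULL && pg->data != NULL);
    pg->sum = page_checksum(pg->data);
    pg->dirty = 0;
}

/* A write happened: the recorded checksum is no longer meaningful. */
void page_mark_dirty(struct tracked_page *pg)
{
    assert(pg != NULL);
    pg->dirty = 1;
}

/*
 * Walk all pages and return the index of the first clean page whose
 * contents no longer match the recorded checksum, or -1 if none is found.
 * Dirty pages are skipped because there is nothing to compare against.
 */
long scan_pages(const struct tracked_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pages[i].dirty)
            continue;
        if (page_checksum(pages[i].data) != pages[i].sum)
            return (long)i;
    }
    return -1;
}
```

A checker thread would call scan_pages() periodically and log the address of
the offending page. As with parity, it can only tell you that something went
wrong and where, not repair it.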
If someone writes such a thing, I'd recommend adding another feature to it
to enhance the detection ratio if desired:
Give the module a parameter which will change the refresh rate. This trick
is used in the best SW RAM test I know to check for "weak" bits. These are
commonly caused by a DRAM cell discharging too fast, and thus show up
more frequently at low refresh rates.
CU, Andy
-- Andreas Beck | Email : <becka@sunserver1.rz.uni-duesseldorf.de>