Re: Re: Re: new module to check constant memory for corruption
From: Alexander . Kleinsorge
Date: Sun Apr 13 2014 - 12:23:21 EST
Hi Andi,
1. I can build in a check for whether ftrace is enabled (something like: cat /debug/tracing/tracing_enabled != 1).
2. The main goal is to detect real RAM errors on non-ECC systems (i.e. normal PCs). This happens more often than you think:
DRAM Errors in the Wild: A Large-Scale Field Study, May 2009 (http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf)
The rate is about 1 bit per GB per year!
3. I read a blog some weeks ago where an Android developer had a problem with accidentally corrupted kernel memory (on ARM).
He asked for the x86 feature [arch/x86/mm/init_32.c + init_64.c / void mark_rodata_ro()]. He wasted a long time debugging it.
4. If kernel text or data is wrong by 1 bit (RAM error), the system can crash, but it does not have to!
Worst case: the single flipped bit produces similar but still runnable code.
And this range of memory is the most critical one (even if it is only 1% of all RAM).
Alex
-----------------------------------------------------
Sent: Sunday, 13 April 2014 at 17:55
From: "Andi Kleen" <andi@xxxxxxxxxxxxxx>
To: Alexander.Kleinsorge@xxxxxx
Cc: "Andi Kleen" <andi@xxxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: Re: new module to check constant memory for corruption
> your question: there are no writes in this write-protected address range (e.g. kernel code).
It's actually not true, Linux changes r/o code. But you could
handle that by hooking into the right places.
> my idea is to calculate a checksum (xor is fastest) over this range and check later (periodically) if it's unchanged.
> see source code download (5 KB): http://tauruz.homeip.net/ramcheck.tgz
> the code is working fine and the checksum is (as expected) constant (at least for many hours).
>
So is the goal security or reliability or debugging?
Reliability:
I have doubts it makes sense for that. On most systems the code
is only a very small part of the total memory. So you wouldn't
cover most data.
Also if something corrupts the code we likely already detect it
eventually by crashing. Your module would need to panic too in this
case.
Security: If someone can change the code what stops them from changing
the checksum module too?
Also if you use a poor (= fast) checksum it's likely easy to construct
a valid patch that does not change the checksum.
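To illustrate the point about weak checksums, here is a minimal user-space sketch in C. It is not taken from the ramcheck module; the names xor_checksum, flip_bit and xor_misses_paired_flips and the sample words are hypothetical. It assumes the checksum is a plain XOR over 64-bit words and shows that a single flipped bit changes the result, while flipping the same bit position in a second word restores it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR all 64-bit words in [words, words + count) into one value. */
static uint64_t xor_checksum(const uint64_t *words, size_t count)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < count; i++)
        sum ^= words[i];
    return sum;
}

/* Flip one bit (0..63) of one word, as a RAM error or a patch would. */
static void flip_bit(uint64_t *words, size_t index, unsigned bit)
{
    assert(bit < 64);
    words[index] ^= (uint64_t)1 << bit;
}

/*
 * Returns 1 if a single flip is detected by the XOR checksum while a
 * second flip of the same bit position in another word hides both.
 * The region contents are arbitrary sample values (an assumption).
 */
static int xor_misses_paired_flips(void)
{
    uint64_t region[4] = {
        0x1122334455667788ULL, 0x0123456789abcdefULL,
        0xfedcba9876543210ULL, 0xdeadbeefcafef00dULL,
    };
    uint64_t reference = xor_checksum(region, 4);

    flip_bit(region, 0, 5);                 /* one corrupted bit */
    int single_detected = xor_checksum(region, 4) != reference;

    flip_bit(region, 2, 5);                 /* compensating flip elsewhere */
    int pair_missed = xor_checksum(region, 4) == reference;

    return single_detected && pair_missed;
}
```

So a plain XOR is enough to catch an isolated single-bit RAM error, but anyone modifying the code on purpose only needs one compensating word to keep the checksum unchanged.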
Debugging:
Maybe, but I have never seen a bug where code got corrupted.
The user program technique works reasonably well for finding bad
pointers. Write a program that allocates a lot of memory. Regularly
checksum and recheck all its memory.
-Andi
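The user program technique described above could look roughly like the following C sketch. It is an illustration under stated assumptions, not an existing tool: struct mem_probe, probe_init, probe_recheck and probe_free are hypothetical names, the fill pattern is arbitrary, and the checksum is an FNV-1a-style mixing step applied per 64-bit word, which the thread does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* A block of user memory plus the checksum taken right after filling it. */
struct mem_probe {
    uint64_t *words;
    size_t count;
    uint64_t checksum;
};

/* FNV-1a-style mix over 64-bit words; any change to one word changes the result. */
static uint64_t probe_sum(const uint64_t *words, size_t count)
{
    uint64_t h = 0xcbf29ce484222325ULL;

    for (size_t i = 0; i < count; i++)
        h = (h ^ words[i]) * 0x100000001b3ULL;
    return h;
}

/* Allocate count words, fill them with a pattern, remember the checksum. */
static int probe_init(struct mem_probe *p, size_t count)
{
    assert(count > 0);
    p->words = malloc(count * sizeof(*p->words));
    if (!p->words)
        return -1;
    p->count = count;
    for (size_t i = 0; i < count; i++)
        p->words[i] = 0xa5a5a5a5a5a5a5a5ULL ^ (uint64_t)i;
    p->checksum = probe_sum(p->words, count);
    return 0;
}

/* Returns 1 if the memory still matches the stored checksum, 0 otherwise. */
static int probe_recheck(const struct mem_probe *p)
{
    return probe_sum(p->words, p->count) == p->checksum;
}

static void probe_free(struct mem_probe *p)
{
    free(p->words);
    p->words = NULL;
    p->count = 0;
}
```

A long-running checker would size the buffer to cover a large share of free RAM, call probe_recheck in a loop with a sleep between passes, and report as soon as it returns 0.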