Re: ftrace introduces instability into kernel 2.6.27(-rc2,-rc3)

From: Steven Rostedt
Date: Mon Aug 18 2008 - 12:12:35 EST


Mathieu Desnoyers wrote:

Steve ? I just noticed this :


ftrace_modify_code(unsigned long ip, unsigned char *old_code,
unsigned char *new_code)
{
unsigned replaced;
unsigned old = *(unsigned *)old_code;
unsigned new = *(unsigned *)new_code;
int faulted = 0;

/*
* Note: Due to modules and __init, code can
* disappear and change, we need to protect against faulting
* as well as code changing.
*
* No real locking needed, this code is run through
* kstop_machine.
*/
asm volatile (
"1: lwz %1, 0(%2)\n"
" cmpw %1, %5\n"
" bne 2f\n"
" stwu %3, 0(%2)\n"
"2:\n"
".section .fixup, \"ax\"\n"
"3: li %0, 1\n"
" b 2b\n"
".previous\n"
".section __ex_table,\"a\"\n"
_ASM_ALIGN "\n"
_ASM_PTR "1b, 3b\n"
".previous"
: "=r"(faulted), "=r"(replaced)
: "r"(ip), "r"(new),
"0"(faulted), "r"(old)
: "memory");

if (replaced != old && replaced != new)
faulted = 2;

if (!faulted)
flush_icache_range(ip, ip + 8);

return faulted;
}

What happens if you are really unlucky and get the following execution
sequence ?


Load module A at address 0xddfc3e00
Populate ftrace function table while the system runs
Unload module A
Load module B at address 0xddfc3e00
Activate ftrace
-> At that point, since there is another module loaded in the same
address space, it won't fault. If there happens to be an instruction
which looks exactly like the instruction we are replacing at the same
location, we are introducing a bug. True both on x86 ans powerpc...


Hi Mathieu,

Yep I know of this issue, and it is very unlikely it would happen. But that's not good enough, so this is why I did this:

http://marc.info/?l=linux-kernel&m=121876928124384&w=2
http://marc.info/?l=linux-kernel&m=121876938524523&w=2

;-)

On powerpc it would be less likely an issue since the code segments are all 4 bytes aligned. And a call being replaced would be a call to mcount (relative jump). A call to mcount would be the same for both. Then again, we could be replacing a nop, but most likely that wouldn't hurt either, since nops are seldom used, and if we did call mcount, it would probably not hurt. But it would be an issue if Module B happen to put a data section that matched the code to jmp to mcount or a nop.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/