Re: [PATCH] mm: fix lazy vmap purging (use-after-free error)

From: Vegard Nossum
Date: Sat Feb 21 2009 - 13:09:14 EST


2009/2/21 Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>:
>> rcu_check_callbacks (cpu=0, user=0) at kernel/rcutree.c:949
>> 949 {
>> ...
>> rcu_check_callbacks (cpu=0, user=-1049147360) at kernel/rcutree.c:967
>> 967 rcu_qsctr_inc(cpu);
>
> ???? Are the argument values trustworthy? If so, I don't see how
> the variable user transitioned from zero to non-zero.
>
> The value user!=0 tells RCU that we were interrupted from a user process,
> but this immediately follows user==0. If we really were interrupted
> from kernel code, (including from an irq handler) we should have user==0.
>
> The user!=0 causes RCU to conclude that we are in a quiescent state.
>
> RCU is then within its rights to process callbacks, which would result
> in the behavior you saw.

Ah, curious. Thanks for the explanation.

I tried again, just to be sure:

Breakpoint 1, rcu_check_callbacks (cpu=0, user=0) at kernel/rcutree.c:949
949 {
(gdb) p &user
Address requested for identifier "user" which is in register $edx
(gdb) p user
$1 = 0
(gdb) s
950 if (user ||
(gdb)
949 {
(gdb)
950 if (user ||
(gdb)
idle_cpu (cpu=0) at kernel/sched.c:5196
5196 return cpu_curr(cpu) == cpu_rq(cpu)->idle;
(gdb)
5197 }
(gdb)
idle_cpu (cpu=<value optimized out>) at kernel/sched.c:5196
5196 return cpu_curr(cpu) == cpu_rq(cpu)->idle;
(gdb)
5197 }
(gdb)
rcu_check_callbacks (cpu=0, user=-1049147360) at kernel/rcutree.c:967
967 rcu_qsctr_inc(cpu);

Could that be a missing "d" clobber in some inline assembly? Or a
miscompilation?
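
For anyone following along, this is roughly what I mean by a missing
clobber. It is a minimal sketch, not code from the kernel: zero_edx()
and user_survives() are made-up names for illustration. "d" is the GCC
constraint letter for %edx; in a clobber list the register is named
as "edx".

```c
/* Hypothetical helper: overwrites %edx on x86, does nothing elsewhere. */
static inline void zero_edx(void)
{
#if defined(__i386__) || defined(__x86_64__)
    /*
     * The instruction writes %edx and the flags, so both are listed as
     * clobbers. If "edx" were left out, the compiler would be free to
     * keep a live value in %edx across this statement, and that value
     * would be silently destroyed.
     */
    __asm__ volatile("xorl %%edx, %%edx" : : : "edx", "cc");
#endif
}

/* 'user' stands in for an argument the compiler might hold in %edx. */
static int user_survives(int user)
{
    zero_edx();
    return user; /* intact, because the clobber list is complete */
}
```

With the clobber in place, user_survives(x) returns x for any x. Without
it, the result would depend on where the compiler happened to keep
'user', which is the kind of corruption I was wondering about.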

Here's the disassembly (I hope it won't wrap):

0xc1073ec0 <rcu_check_callbacks+0>: push %ebp
0xc1073ec1 <rcu_check_callbacks+1>: test %edx,%edx
0xc1073ec3 <rcu_check_callbacks+3>: mov %esp,%ebp
0xc1073ec5 <rcu_check_callbacks+5>: push %ebx
0xc1073ec6 <rcu_check_callbacks+6>: mov %eax,%ebx
0xc1073ec8 <rcu_check_callbacks+8>: je 0xc1073f08 <rcu_check_callbacks+72>
0xc1073eca <rcu_qsctr_inc+0>: mov $0xc1771320,%eax
0xc1073ecf <rcu_qsctr_inc+5>: add -0x3e8fa900(,%ebx,4),%eax
0xc1073ed6 <rcu_qsctr_inc+12>: mov (%eax),%edx
0xc1073ed8 <rcu_qsctr_inc+14>: movb $0x1,0xc(%eax)
0xc1073edc <rcu_qsctr_inc+18>: mov %edx,0x8(%eax)
0xc1073edf <rcu_bh_qsctr_inc+0>: mov $0xc1771380,%eax
0xc1073ee4 <rcu_bh_qsctr_inc+5>: add -0x3e8fa900(,%ebx,4),%eax
0xc1073eeb <rcu_bh_qsctr_inc+12>: mov (%eax),%edx
0xc1073eed <rcu_bh_qsctr_inc+14>: movb $0x1,0xc(%eax)
0xc1073ef1 <rcu_bh_qsctr_inc+18>: mov %edx,0x8(%eax)
0xc1073ef4 <rcu_check_callbacks+52>: mov $0x8,%eax

It seems to be the inlined rcu_qsctr_inc() that reloads %edx. If I had
to guess, I'd say x86's per_cpu macros. But it seems strange that the
corruption would not manifest in other ways too.
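
To make the guess concrete, here is a userspace model of what the
inlined rcu_qsctr_inc() appears to do, going by the disassembly alone.
The struct layout, the field names and the per_cpu_offset[] table are
my assumptions, chosen to match the 0x0/0x8/0xc offsets and the table
lookup indexed by %ebx. It is not the kernel's per_cpu implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4

/* Assumed layout; 32-bit fields so the offsets match the i386 listing. */
struct qs_data {
    int32_t completed;               /* offset 0x0 */
    int32_t gpnum;                   /* offset 0x4 (not touched here) */
    int32_t passed_quiesc_completed; /* offset 0x8 */
    uint8_t passed_quiesc;           /* offset 0xc */
};

static struct qs_data areas[NR_CPUS];
/* Stands in for the offset table read by "add -0x3e8fa900(,%ebx,4),%eax". */
static ptrdiff_t per_cpu_offset[NR_CPUS];

static void init_offsets(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        per_cpu_offset[cpu] = (char *)&areas[cpu] - (char *)&areas[0];
}

/* Base address plus this CPU's offset, as in the mov/add pair. */
static struct qs_data *per_cpu_ptr(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (struct qs_data *)((char *)&areas[0] + per_cpu_offset[cpu]);
}

static void qsctr_inc(int cpu)
{
    struct qs_data *d = per_cpu_ptr(cpu);
    int32_t snap = d->completed;       /* mov (%eax),%edx: reuses %edx */
    d->passed_quiesc = 1;              /* movb $0x1,0xc(%eax) */
    d->passed_quiesc_completed = snap; /* mov %edx,0x8(%eax) */
}
```

After init_offsets(), calling qsctr_inc(2) sets the flag and copies
'completed' for CPU 2 only. The temporary 'snap' is the value that ends
up in %edx in the listing, which is where the register gets reloaded.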

Stand by for further investigations :-)


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036