Re: [PATCH] mm: fix lazy vmap purging (use-after-free error)

From: Paul E. McKenney
Date: Sat Feb 21 2009 - 13:33:27 EST


On Sat, Feb 21, 2009 at 07:08:55PM +0100, Vegard Nossum wrote:
> 2009/2/21 Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>:
> >> rcu_check_callbacks (cpu=0, user=0) at kernel/rcutree.c:949
> >> 949 {
> >> ...
> >> rcu_check_callbacks (cpu=0, user=-1049147360) at kernel/rcutree.c:967
> >> 967 rcu_qsctr_inc(cpu);
> >
> > ???? Are the argument values trustworthy? If so, I don't see how
> > the variable user transitioned from zero to non-zero.
> >
> > The value user!=0 tells RCU that we were interrupted from a user process,
> > but this immediately follows user==0. If we really were interrupted
> > from kernel code (including from an irq handler), we should have user==0.
> >
> > The user!=0 causes RCU to conclude that we are in a quiescent state.
> >
> > RCU is then within its rights to process callbacks, which would result
> > in the behavior you saw.
>
> Ah, curious. Thanks for the explanation.
>
> I tried again, just to be sure:
>
> Breakpoint 1, rcu_check_callbacks (cpu=0, user=0) at kernel/rcutree.c:949
> 949 {
> (gdb) p &user
> Address requested for identifier "user" which is in register $edx
> (gdb) p user
> $1 = 0
> (gdb) s
> 950 if (user ||
> (gdb)
> 949 {
> (gdb)
> 950 if (user ||
> (gdb)
> idle_cpu (cpu=0) at kernel/sched.c:5196
> 5196 return cpu_curr(cpu) == cpu_rq(cpu)->idle;
> (gdb)
> 5197 }
> (gdb)
> idle_cpu (cpu=<value optimized out>) at kernel/sched.c:5196
> 5196 return cpu_curr(cpu) == cpu_rq(cpu)->idle;
> (gdb)
> 5197 }
> (gdb)
> rcu_check_callbacks (cpu=0, user=-1049147360) at kernel/rcutree.c:967
> 967 rcu_qsctr_inc(cpu);
>
> Could that be a missing "d" clobber in some inline assembly? Or a
> miscompilation?

Hmmm... cpu_rq() does invoke per_cpu()...
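
For anyone following along, here is a rough user-space model of the check
being single-stepped above. Only the "if (user ||" and the idle_cpu() call
come from the gdb output; the softirq and hardirq-nesting terms are my
recollection of the condition, and struct tick_ctx and tick_is_quiescent()
are made-up names for illustration, not kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the tick handler consults. */
struct tick_ctx {
	bool cpu_idle;        /* models idle_cpu(cpu) */
	bool in_softirq;      /* models in_softirq() (assumed term) */
	int hardirq_nesting;  /* 1 == only the tick interrupt itself (assumed term) */
};

/*
 * Returns true if this tick would be counted as a quiescent state:
 * either we interrupted user mode, or we interrupted the idle loop
 * with no softirq and no nested hardirq in progress.
 */
static bool tick_is_quiescent(int user, const struct tick_ctx *c)
{
	return user ||
	       (c->cpu_idle && !c->in_softirq && c->hardirq_nesting <= 1);
}
```

With user==0 and a busy CPU this returns false, so no quiescent state is
recorded. Any non-zero user, including a garbage value such as -1049147360,
makes it return true regardless of the rest of the context.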

> Here's the disassembly (I hope it won't wrap):
>
> 0xc1073ec0 <rcu_check_callbacks+0>: push %ebp
> 0xc1073ec1 <rcu_check_callbacks+1>: test %edx,%edx
> 0xc1073ec3 <rcu_check_callbacks+3>: mov %esp,%ebp
> 0xc1073ec5 <rcu_check_callbacks+5>: push %ebx
> 0xc1073ec6 <rcu_check_callbacks+6>: mov %eax,%ebx
> 0xc1073ec8 <rcu_check_callbacks+8>: je 0xc1073f08 <rcu_check_callbacks+72>
> 0xc1073eca <rcu_qsctr_inc+0>: mov $0xc1771320,%eax
> 0xc1073ecf <rcu_qsctr_inc+5>: add -0x3e8fa900(,%ebx,4),%eax
> 0xc1073ed6 <rcu_qsctr_inc+12>: mov (%eax),%edx
> 0xc1073ed8 <rcu_qsctr_inc+14>: movb $0x1,0xc(%eax)
> 0xc1073edc <rcu_qsctr_inc+18>: mov %edx,0x8(%eax)
> 0xc1073edf <rcu_bh_qsctr_inc+0>: mov $0xc1771380,%eax
> 0xc1073ee4 <rcu_bh_qsctr_inc+5>: add -0x3e8fa900(,%ebx,4),%eax
> 0xc1073eeb <rcu_bh_qsctr_inc+12>: mov (%eax),%edx
> 0xc1073eed <rcu_bh_qsctr_inc+14>: movb $0x1,0xc(%eax)
> 0xc1073ef1 <rcu_bh_qsctr_inc+18>: mov %edx,0x8(%eax)
> 0xc1073ef4 <rcu_check_callbacks+52>: mov $0x8,%eax
>
> Seems to be rcu_qsctr_inc() that reloads %edx. If I'd guess, I'd say
> x86's per_cpu macros. But it seems so strange that the corruption
> would not manifest in other ways too.
>
> Stand by for further investigations :-)

I will look into this, but it will take a bit.
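
As a starting point, this is how I read the three inlined rcu_qsctr_inc()
instructions in the disassembly, written as a user-space sketch. The struct
and function names are placeholders, and the field names are my assumption
about what lives at offsets 0x0, 0x8 and 0xc; please check them against the
real struct rcu_data before relying on this.

```c
#include <stdbool.h>

/* Placeholder layout; field names are assumed, not taken from the trace. */
struct rcu_data_model {
	long completed;                /* word read at offset 0x0 */
	long gpnum;                    /* not touched by these instructions */
	long passed_quiesc_completed;  /* word written at offset 0x8 */
	bool passed_quiesc;            /* byte written at offset 0xc */
};

static void qsctr_inc_model(struct rcu_data_model *rdp)
{
	long tmp = rdp->completed;           /* mov  (%eax),%edx      */
	rdp->passed_quiesc = true;           /* movb $0x1,0xc(%eax)   */
	rdp->passed_quiesc_completed = tmp;  /* mov  %edx,0x8(%eax)   */
}
```

Note that tmp lives in %edx here, the same register gdb reported for "user"
on entry. The "test %edx,%edx" at +1 comes before this reload, and the
intervening mov and push do not modify the flags, so the "je" at +8 is
decided on the value that was passed in. Whether gdb's later printout of
"user" still reflects the argument is something I have not verified.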

Thanx, Paul