Re: [PATCH] mm: fix lazy vmap purging (use-after-free error)

From: Paul E. McKenney
Date: Sat Feb 21 2009 - 22:00:43 EST


On Sat, Feb 21, 2009 at 07:37:20PM +0100, Vegard Nossum wrote:
> 2009/2/21 Vegard Nossum <vegard.nossum@xxxxxxxxx>:
> > Here's the disassembly (I hope it won't wrap):
> >
> > 0xc1073ec0 <rcu_check_callbacks+0>: push %ebp
> > 0xc1073ec1 <rcu_check_callbacks+1>: test %edx,%edx
> > 0xc1073ec3 <rcu_check_callbacks+3>: mov %esp,%ebp
> > 0xc1073ec5 <rcu_check_callbacks+5>: push %ebx
> > 0xc1073ec6 <rcu_check_callbacks+6>: mov %eax,%ebx
> > 0xc1073ec8 <rcu_check_callbacks+8>: je 0xc1073f08 <rcu_check_callbacks+72>
> > 0xc1073eca <rcu_qsctr_inc+0>: mov $0xc1771320,%eax
> > 0xc1073ecf <rcu_qsctr_inc+5>: add -0x3e8fa900(,%ebx,4),%eax
> > 0xc1073ed6 <rcu_qsctr_inc+12>: mov (%eax),%edx
> > 0xc1073ed8 <rcu_qsctr_inc+14>: movb $0x1,0xc(%eax)
> > 0xc1073edc <rcu_qsctr_inc+18>: mov %edx,0x8(%eax)
> > 0xc1073edf <rcu_bh_qsctr_inc+0>: mov $0xc1771380,%eax
> > 0xc1073ee4 <rcu_bh_qsctr_inc+5>: add -0x3e8fa900(,%ebx,4),%eax
> > 0xc1073eeb <rcu_bh_qsctr_inc+12>: mov (%eax),%edx
> > 0xc1073eed <rcu_bh_qsctr_inc+14>: movb $0x1,0xc(%eax)
> > 0xc1073ef1 <rcu_bh_qsctr_inc+18>: mov %edx,0x8(%eax)
> > 0xc1073ef4 <rcu_check_callbacks+52>: mov $0x8,%eax
> >
> > Seems to be rcu_qsctr_inc() that reloads %edx. If I'd guess, I'd say
> > x86's per_cpu macros. But it seems so strange that the corruption
> > would not manifest in other ways too.
> >
>
> Okay, I don't really think it's an error. The if (user) test happens
> at the very beginning and gcc decides to reuse %edx. GDB doesn't know
> this, so it thinks the parameter changed, but at this point the
> parameter simply won't be used anymore.
>
> So you're right: The value can't be trusted (after entry, anyway).

OK. So at least the compiler is sane. ;-)

And the fact that RCU Classic behaves the same as hierarchical RCU
pretty clearly points at some issue with the quiescent-state check code:

void rcu_check_callbacks(int cpu, int user)
{
	if (user ||
	    (idle_cpu(cpu) && !in_softirq() &&
	     hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
		rcu_qsctr_inc(cpu);
		rcu_bh_qsctr_inc(cpu);
	} else if (!in_softirq()) {
		rcu_bh_qsctr_inc(cpu);
	}
	raise_softirq(RCU_SOFTIRQ);
}

In the case you traced earlier, we interrupted out of kernel code, yet
somehow arrived at rcu_qsctr_inc(). We know that "user" really was 0,
thanks to your careful analysis, so the issue must be in the other
clause. Since we interrupted out of mainline kernel code, in_softirq()
should have returned 0, and hardirq_count() should have been no greater
than (1 << HARDIRQ_SHIFT), meaning no nested hardirqs, so both of those
checks would have passed.

You mentioned some concern about idle_cpu() separately, and if idle_cpu()
was returning 1, then RCU would most certainly decide that it was in a
quiescent state and that it could end the current grace period.
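
To make that reasoning concrete, here is a minimal user-space sketch of
the decision made by the function quoted above. It is only a model, not
kernel code: model_rcu_check(), the MODEL_QS_* values, and
MODEL_HARDIRQ_SHIFT are hypothetical names introduced for illustration,
and the shift value of 16 is an assumption, since the real
HARDIRQ_SHIFT depends on the kernel configuration.

```c
#include <stdbool.h>

/* Assumed value for illustration only; not taken from any real config. */
#define MODEL_HARDIRQ_SHIFT 16

/* Which quiescent states the modeled check would record. */
enum model_qs {
	MODEL_QS_NONE,		/* neither rcu nor rcu_bh */
	MODEL_QS_BH_ONLY,	/* rcu_bh_qsctr_inc() only */
	MODEL_QS_BOTH		/* rcu_qsctr_inc() and rcu_bh_qsctr_inc() */
};

/*
 * Hypothetical model of the branch structure in rcu_check_callbacks().
 * The arguments stand in for "user", idle_cpu(cpu), in_softirq() and
 * hardirq_count() as sampled at the time of the interrupt.
 */
static enum model_qs model_rcu_check(bool user, bool idle, bool in_softirq,
				     unsigned long hardirq_count)
{
	if (user ||
	    (idle && !in_softirq &&
	     hardirq_count <= (1UL << MODEL_HARDIRQ_SHIFT)))
		return MODEL_QS_BOTH;
	else if (!in_softirq)
		return MODEL_QS_BH_ONLY;
	return MODEL_QS_NONE;
}
```

With user == 0, in_softirq == 0, and a single level of hardirq nesting,
the model returns MODEL_QS_BH_ONLY when idle is 0 and MODEL_QS_BOTH when
idle is 1. Under those assumptions, the idle_cpu() result is the only
input left that can send us to rcu_qsctr_inc().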

Thanx, Paul
