Re: [PATCH] mm: fix lazy vmap purging (use-after-free error)

From: Ingo Molnar
Date: Fri Feb 20 2009 - 11:02:26 EST



* Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:

> 2009/2/20 Ingo Molnar <mingo@xxxxxxx>:
> >
> > * Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> >> ah, indeed:
> >>
> >> list_del_rcu(&va->list);
> >>
> >> i suspect it could be hit big time in a workload that opens
> >> more than 512 files, as expand_files() uses a
> >> vmalloc()+vfree() pair in that case.
> >
> > hm, perhaps it's not a problem after all. The freeing is done
> > via rcu, and list_del_rcu() leaves the forward pointer intact.
>
> Well, it's not the particular line that you posted, in any case.
> That's &va->list, but the traversed list is &va->purge_list.
>
> I thought it would be the line:
>
> call_rcu(&va->rcu_head, rcu_free_va);
>
> (which does kfree() in the callback) that was the problem.
>
> >
> > So how did it happen that the entry got kfree()d before the loop
> > was done? We are in a spinlocked section so the CPU should not
> > have entered rcu processing.
>
> I added some printks to __free_vmap_area() and rcu_free_va(), and it
> shows that the kfree() is being called immediately (inside the list
> traversal). So the call_rcu() is happening immediately (or almost
> immediately).
>
> If I've understood correctly, the RCU processing can happen inside a
> spinlock, as long as interrupts are enabled. (Won't the timer IRQ
> trigger softirq processing, which triggers RCU callback processing,
> for example?)
>
> And interrupts are enabled when this happens: EFLAGS: 00000292
>
> Please correct me if I am wrong!

The timer irq will do RCU garbage-collection - but only of
entries where a grace period has passed.

Otherwise there would be no point in using RCU at all, if the
kfree() can happen immediately. RCU is about delaying action,
and doing it at a point in time when we sure are in a quiescent
state. (we have done a context-switch or scheduled to idle)

The question is, is this piece of loop traversal code
preemptible? I dont think it is since it's embedded in a
spinlock:

spin_lock(&vmap_area_lock);
- list_for_each_entry(va, &valist, purge_list)
+ list_for_each_entry_safe(va, n_va, &valist, purge_list)
__free_vmap_area(va);
spin_unlock(&vmap_area_lock);

[ on -rt this could be preemptible and this would be a real fix
there. I just dont see how the kfree() can execute on
mainline. Obviously it did, since you got the kmemcheck
assert. ]

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/