Re: [PATCH] mm: fix lazy vmap purging (use-after-free error)

From: Paul E. McKenney
Date: Fri Feb 20 2009 - 11:49:52 EST


On Fri, Feb 20, 2009 at 05:01:57PM +0100, Ingo Molnar wrote:
>
> * Vegard Nossum <vegard.nossum@xxxxxxxxx> wrote:
>
> > 2009/2/20 Ingo Molnar <mingo@xxxxxxx>:
> > >
> > > * Ingo Molnar <mingo@xxxxxxx> wrote:
> > >
> > >> ah, indeed:
> > >>
> > >> list_del_rcu(&va->list);
> > >>
> > >> i suspect it could be hit big time in a workload that opens
> > >> more than 512 files, as expand_files() uses a
> > >> vmalloc()+vfree() pair in that case.
> > >
> > > hm, perhaps it's not a problem after all. The freeing is done
> > > via rcu, and list_del_rcu() leaves the forward pointer intact.
> >
> > Well, it's not the particular line that you posted, in any case.
> > That's &va->list, but the traversed list is &va->purge_list.
> >
> > I thought it would be the line:
> >
> > call_rcu(&va->rcu_head, rcu_free_va);
> >
> > (which does kfree() in the callback) that was the problem.
> >
> > >
> > > So how did it happen that the entry got kfree()d before the loop
> > > was done? We are in a spinlocked section so the CPU should not
> > > have entered rcu processing.
> >
> > I added some printks to __free_vmap_area() and rcu_free_va(), and it
> > shows that the kfree() is being called immediately (inside the list
> > traversal). So the call_rcu() is happening immediately (or almost
> > immediately).
> >
> > If I've understood correctly, the RCU processing can happen inside a
> > spinlock, as long as interrupts are enabled. (Won't the timer IRQ
> > trigger softirq processing, which triggers RCU callback processing,
> > for example?)
> >
> > And interrupts are enabled when this happens: EFLAGS: 00000292
> >
> > Please correct me if I am wrong!
>
> The timer irq will do RCU garbage-collection - but only of
> entries where a grace period has passed.
>
> Otherwise there would be no point in using RCU at all, if the
> kfree() can happen immediately. RCU is about delaying action,
> and doing it at a point in time when we sure are in a quiescent
> state. (we have done a context-switch or scheduled to idle)

Indeed, preemptable RCU grace periods extend for many milliseconds.

So, Vegard, when you say "immediately", do you mean "within
microseconds" or "faster than my human reflexes can react to"?

Thanx, Paul

> The question is, is this piece of loop traversal code
> preemptible? I dont think it is since it's embedded in a
> spinlock:
>
> spin_lock(&vmap_area_lock);
> - list_for_each_entry(va, &valist, purge_list)
> + list_for_each_entry_safe(va, n_va, &valist, purge_list)
> __free_vmap_area(va);
> spin_unlock(&vmap_area_lock);
>
> [ on -rt this could be preemptible and this would be a real fix
> there. I just dont see how the kfree() can execute on
> mainline. Obviously it did, since you got the kmemcheck
> assert. ]
>
> Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/