Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list

From: Roman Peniaev
Date: Thu Apr 14 2016 - 10:44:54 EST


On Thu, Apr 14, 2016 at 3:49 PM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Apr 14, 2016 at 03:13:26PM +0200, Roman Peniaev wrote:
>> Hi, Chris.
>>
>> Is it on purpose that the VM_LAZY_FREE flag is not dropped in
>> __purge_vmap_area_lazy()? With your patch va->flags
>> will have two bits set: VM_LAZY_FREE | VM_LAZY_FREEING.
>> That does not seem too bad, because no other code path
>> cares, but the reason for the change is still not clear.
>
> Oh, that was just a bad deletion.
>
>> Also, did you consider not taking the static purge_lock
>> in __purge_vmap_area_lazy()? With your change it seems
>> that you can avoid taking this lock at all.
>> Just be careful when you observe the llist as empty, i.e.
>> nr == 0.
>
> I admit I only briefly looked at the lock. I will be honest and say I
> do not fully understand the requirements of the sync/force_flush
> parameters.

if sync:
  o I can wait for another purge in progress
    (do not care if purge_lock is dropped)

  o purge fragmented blocks

if force_flush:
  o even if there is nothing to purge, flush the TLB, which is costly
    (again, sync-like behaviour is implied)
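
To spell the two flags out, here is a minimal userspace sketch of how I
read them. It is a model, not the kernel code: purge_model(), struct
purge_stats and the atomic_flag standing in for purge_lock are made-up
names for illustration, and the counters replace the real purge and TLB
flush work.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the static purge_lock (assumption: a simple test-and-set lock). */
static atomic_flag purge_lock = ATOMIC_FLAG_INIT;

static bool purge_trylock(void)
{
	return !atomic_flag_test_and_set_explicit(&purge_lock,
						  memory_order_acquire);
}

static void purge_unlock(void)
{
	atomic_flag_clear_explicit(&purge_lock, memory_order_release);
}

/* Hypothetical counters replacing the real work. */
struct purge_stats {
	int fragmented_purges;	/* times fragmented blocks were purged */
	int tlb_flushes;	/* times the (costly) TLB flush ran */
};

/* Returns false if the purge was skipped because another one was running. */
static bool purge_model(int nr_lazy, bool sync, bool force_flush,
			struct purge_stats *st)
{
	assert(nr_lazy >= 0);

	if (!sync && !force_flush) {
		/* Opportunistic caller: never wait for a purge in progress. */
		if (!purge_trylock())
			return false;
	} else {
		/* sync, or force_flush which implies it: wait our turn. */
		while (!purge_trylock())
			;
	}

	if (sync)
		st->fragmented_purges++;

	/* Flush if something was purged, or if the caller insists. */
	if (nr_lazy > 0 || force_flush)
		st->tlb_flushes++;

	purge_unlock();
	return true;
}
```

In this model the lock only matters for deciding whether a caller waits
or backs off, which is why relaxing sync would make it removable.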

> purge_fragmented_blocks() manages per-cpu lists, so that looks safe
> under its own rcu_read_lock.
>
> Yes, it looks feasible to remove the purge_lock if we can relax sync.

What is still left is waiting on vmap_area_lock in !sync mode,
but that is probably not that bad.

>
>> > @@ -706,6 +703,8 @@ static void purge_vmap_area_lazy(void)
>> >  static void free_vmap_area_noflush(struct vmap_area *va)
>> >  {
>> >         va->flags |= VM_LAZY_FREE;
>> > +       llist_add(&va->purge_list, &vmap_purge_list);
>> > +
>> >         atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
>>
>> It seems to me that this is a very long-standing problem: once you mark
>> va->flags as VM_LAZY_FREE, va can be freed immediately from another CPU.
>> If so, the line:
>>
>> atomic_add((va->va_end - va->va_start)....
>>
>> is a use-after-free access.
>>
>> So I would also fix it by carefully reordering the lines and adding a
>> barrier (the barrier is probably redundant here, because llist_add implies
>> cmpxchg, but I simply want to be explicit, showing that marking va as
>> VM_LAZY_FREE and adding it to the list should come last):
>>
>> -       va->flags |= VM_LAZY_FREE;
>>         atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
>> +       smp_mb__after_atomic();
>> +       va->flags |= VM_LAZY_FREE;
>> +       llist_add(&va->purge_list, &vmap_purge_list);
>>
>> What do you think?
>
> Yup, it is racy. We can drop the modification of LAZY_FREE/LAZY_FREEING
> to ease one headache, since those bits are not inspected anywhere afaict.

Yes, those flags can be completely dropped.

> Would not using atomic_add_return() be even clearer with respect to
> ordering:
>
>         nr_lazy = atomic_add_return((va->va_end - va->va_start) >> PAGE_SHIFT,
>                                     &vmap_lazy_nr);
>         llist_add(&va->purge_list, &vmap_purge_list);
>
>         if (unlikely(nr_lazy > lazy_max_pages()))
>                 try_purge_vmap_area_lazy();
>
> Since it doesn't matter that much if we make an extra call to
> try_purge_vmap_area_lazy() when we are on the boundary.

Nice.
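
For the record, the ordering we agreed on can be modelled in userspace
roughly as below. This is only a sketch under assumptions: C11 atomics
stand in for atomic_t and llist, and struct va_model,
free_noflush_model() and purge_take_all() are hypothetical names, not
anything in mm/vmalloc.c. The point it shows is that every read of va
happens before va is published on the lock-free list, since a purger on
another CPU may free it right after.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical, trimmed-down stand-in for struct vmap_area. */
struct va_model {
	unsigned long va_start, va_end;
	struct va_model *next;		/* plays the role of purge_list */
};

static _Atomic(struct va_model *) purge_list;	/* models vmap_purge_list */
static atomic_long lazy_nr;			/* models vmap_lazy_nr */

/* Returns the lazy page count as seen right after our own addition. */
static long free_noflush_model(struct va_model *va)
{
	assert(va->va_end >= va->va_start);

	/* 1. Read everything we need from va while we still own it. */
	long pages = (long)((va->va_end - va->va_start) >> PAGE_SHIFT);

	/* 2. Account for it; fetch_add + pages mimics atomic_add_return(). */
	long nr_lazy = atomic_fetch_add(&lazy_nr, pages) + pages;

	/* 3. Publish last, with a cmpxchg loop like llist_add(). */
	struct va_model *head = atomic_load(&purge_list);
	do {
		va->next = head;
	} while (!atomic_compare_exchange_weak(&purge_list, &head, va));

	/* va must not be dereferenced past this point. */
	return nr_lazy;
}

/* Purger side: detach the whole list at once, like llist_del_all(). */
static struct va_model *purge_take_all(void)
{
	return atomic_exchange(&purge_list, NULL);
}
```

The caller compares the returned count against the threshold using its
local copy, so nothing touches va after the list add.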

--
Roman

> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre