Re: [PATCH v6 13/16] mm: introduce vma_ensure_detached()
From: Peter Zijlstra
Date: Tue Dec 17 2024 - 05:26:56 EST
On Mon, Dec 16, 2024 at 11:24:16AM -0800, Suren Baghdasaryan wrote:
> vma_start_read() can temporarily raise vm_refcnt of a write-locked and
> detached vma:
>
> // vm_refcnt==1 (attached)
> vma_start_write()
>     vma->vm_lock_seq = mm->mm_lock_seq
>
>                     vma_start_read()
>                         vm_refcnt++; // vm_refcnt==2
>
> vma_mark_detached()
>     vm_refcnt--; // vm_refcnt==1
>
> // vma is detached but vm_refcnt!=0 temporarily
>
>                     if (vma->vm_lock_seq == mm->mm_lock_seq)
>                         vma_refcount_put()
>                             vm_refcnt--; // vm_refcnt==0
>
> This is currently not a problem when freeing the vma because RCU grace
> period should pass before kmem_cache_free(vma) gets called and by that
> time vma_start_read() should be done and vm_refcnt is 0. However once
> we introduce possibility of vma reuse before RCU grace period is over,
> this will become a problem (reused vma might be in non-detached state).
> Introduce vma_ensure_detached() for the writer to wait for readers until
> they exit vma_start_read().
So aside from the lockdep problem (which I think is fixable), the normal
way to fix the above is to make dec_and_test() do the kmem_cache_free().
Then the last user does the free and everything just works.
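
For illustration, here is a minimal userspace sketch of that pattern using
C11 atomics. The names struct obj, obj_get(), obj_put() and objs_freed are
made up for the example, and free() stands in for kmem_cache_free(). This is
not the code from the patch, and it ignores RCU and vma reuse entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a vma; not the kernel's vm_area_struct. */
struct obj {
	atomic_int refcnt;	/* 1 while attached, +1 per transient reader */
};

/* Bookkeeping for the example only: counts how many objects were freed. */
static int objs_freed;

static struct obj *obj_alloc(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o)
		atomic_init(&o->refcnt, 1);
	return o;
}

/*
 * Reader side: take a reference unless the count already hit zero.
 * The caller must guarantee the memory is still valid; in this sketch
 * that simply means not calling obj_get() after the final obj_put().
 */
static bool obj_get(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Drop a reference. Whoever takes the count from 1 to 0 does the free,
 * so it does not matter whether the writer or a late reader is last.
 * Returns true if this call freed the object.
 */
static bool obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);	/* catch underflow in the example */
	if (old == 1) {
		free(o);	/* stands in for kmem_cache_free() */
		objs_freed++;
		return true;
	}
	return false;
}
```

Replaying the interleaving quoted above against this sketch: the reader's
obj_get() takes the count to 2, the writer's obj_put() (the detach) takes it
back to 1 and frees nothing, and the reader's obj_put() takes it to 0 and
does the free. The writer never has to wait for the reader to leave.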