Re: [PATCH] fork: defer linking file vma until vma is fully initialized

From: Andrew Morton
Date: Wed Apr 10 2024 - 16:22:06 EST


On Wed, 10 Apr 2024 17:14:41 +0800 Miaohe Lin <linmiaohe@xxxxxxxxxx> wrote:

> Thorvald reported a WARNING [1]. The root cause is the following race:
>
> CPU 1                                   CPU 2
> fork                                    hugetlbfs_fallocate
>  dup_mmap                                hugetlbfs_punch_hole
>   i_mmap_lock_write(mapping);
>   vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree.
>   i_mmap_unlock_write(mapping);
>   hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem!
>                                          i_mmap_lock_write(mapping);
>                                          hugetlb_vmdelete_list
>                                           vma_interval_tree_foreach
>                                            hugetlb_vma_trylock_write -- Vma_lock is cleared.
>   tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem!
>                                            hugetlb_vma_unlock_write -- Vma_lock is assigned!!!
>                                          i_mmap_unlock_write(mapping);
>
> hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside the
> i_mmap_rwsem lock, while the vma lock can be used at the same time. Fix this
> by deferring linking the file vma until the vma is fully initialized. Those
> vmas should be initialized before they can be used.
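
The ordering described above (finish initializing, then link) can be
sketched in plain userspace C. This is only an illustration and is not the
kernel patch: every toy_* name is hypothetical, the list stands in for the
i_mmap tree, and the int pointer stands in for the hugetlb vma_lock.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct toy_vma {
	int *private_lock;	/* plays the role of the hugetlb vma_lock */
	struct toy_vma *next;	/* link in the shared structure */
};

struct toy_mapping {
	struct toy_vma *head;	/* plays the role of the i_mmap tree */
};

/* Step 1: finish all per-object setup while the object is still private. */
static int toy_vma_init(struct toy_vma *vma)
{
	vma->private_lock = calloc(1, sizeof(*vma->private_lock));
	vma->next = NULL;
	return vma->private_lock ? 0 : -1;
}

/* Step 2: only now make the object reachable by other walkers. */
static void toy_vma_link(struct toy_mapping *m, struct toy_vma *vma)
{
	/* In the kernel this step runs under i_mmap_lock_write(). */
	vma->next = m->head;
	m->head = vma;
}

/* A walker, like the hole-punch path, expects every linked object to be usable. */
static int toy_walk_all_initialized(const struct toy_mapping *m)
{
	for (const struct toy_vma *v = m->head; v; v = v->next)
		if (!v->private_lock)
			return 0;
	return 1;
}

/* Returns 1 if the walker never sees a half-built object. */
static int toy_demo(void)
{
	struct toy_mapping m = { .head = NULL };
	struct toy_vma child;
	int ok;

	if (toy_vma_init(&child) != 0)
		return 0;
	toy_vma_link(&m, &child);
	ok = toy_walk_all_initialized(&m);
	free(child.private_lock);
	return ok;
}
```

With the two steps in toy_demo() swapped, a walker running between them
could observe a linked object whose private_lock is still NULL, which is
the window the race diagram shows.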

Cool. I queued this in mm-hotfixes (for 6.8-rcX) and I added a cc:stable.