Re: [PATCH v3 5/6] mm: always lock new vma before inserting into vma tree
From: Suren Baghdasaryan
Date: Thu Aug 03 2023 - 14:34:55 EST
On Thu, Aug 3, 2023 at 11:26 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Thu, Aug 3, 2023 at 11:15 AM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote:
> >
> > * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> [230803 14:02]:
> > > On Thu, 3 Aug 2023 at 10:27, Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > >
> > > > While it's not strictly necessary to lock a newly created vma before
> > > > adding it into the vma tree (as long as no further changes are performed
> > > > to it), it seems like a good policy to lock it and prevent accidental
> > > > changes after it becomes visible to the page faults. Lock the vma before
> > > > adding it into the vma tree.
> > >
> > > So my main reaction here is that I started to wonder about the vma allocation.
> > >
> > > Why doesn't vma_init() do something like
> > >
> > > mmap_assert_write_locked(mm);
> > > vma->vm_lock_seq = mm->mm_lock_seq;
> > >
> > > and instead we seem to expect vma_lock_alloc() to do this (and do it
> > > very badly indeed).
> > >
> > > Strange.
> > >
> > > Anyway, this observation was just a reaction to that "not strictly
> > > necessary to lock a newly created vma" part of the commentary. I feel
> > > like we could/should just make sure that all newly created vma's are
> > > always simply created write-locked.
> > >
> >
> > I thought the same thing initially, but Suren pointed out that it's not
> > necessary to hold the vma lock to allocate a vma object. And it seems
> > there is at least one user (arch/ia64/mm/init.c) which does allocate
> > outside the lock during ia64_init_addr_space(), which is fine but I'm
> > not sure it gains much to do it this way - the insert needs to take the
> > lock anyways and it is hardly going to be contended.
>
> Yeah, I remember discussing that. At the time of VMA creation the
> mmap_lock might not be write-locked, so mmap_assert_write_locked()
> would trigger and mm->mm_lock_seq is not stable. Maybe we can
> necessitate holding mmap_lock at the time of VMA creation but that
> sounds like an unnecessary restriction. IIRC some drivers also create
> vm_are_structs without holding mmap_lock... I'll double-check.
Yeah, there are places like an initcall gate_vma_init() which call
vma_init(). I don't think these are called with a locked mmap_lock.
>
> >
> > Anywhere else besides an address space setup would probably introduce a
> > race.
> >
> > Thanks,
> > Liam
> >