Re: page fault fastpath patch v2: fix race conditions, stats for8,32 and 512 cpu SMP

From: Rajesh Venkatasubramanian
Date: Mon Aug 23 2004 - 18:34:07 EST




On Mon, 23 Aug 2004, Christoph Lameter wrote:

> Would it not be better if exit_mmap would hold a writelock on
> mm->mmap_sem and the page_table_lock? This would insure that page faults
> would not change pte's. exit_mmap changes the memory map so it seems that
> the wrong lock is used here.

exit_mmap() is only called from kernel/fork.c/mmput(). We call exit_mmap()
only if (atomic_dec_and_lock(&mm->mm_users, &mmlist_lock)).

So there are no other active thread (mm_user) other than the current
exit_mmap() thread. This gives thread exclusion. So we don't need
mm->mmap_sem.

However, we have to lock out truncate()->zap_pmd_range(), rmap.c
functions, and other places from walking the page tables while we
are freeing the page tables in exit_mmap(). The page_table_lock in
exit_mmap() provides that exclusion.

That's my understanding. Correct me if I am wrong.

Thanks,
Rajesh

> On Wed, 18 Aug 2004, William Lee Irwin III wrote:
>
> > On Wed, 18 Aug 2004, William Lee Irwin III wrote:
> > >> exit_mmap() has removed the vma from ->i_mmap and ->mmap prior to
> > >> unmapping the pages, so this should be safe unless that operation
> > >> can be caught while it's in progress.
> >
> > On Wed, Aug 18, 2004 at 08:07:24PM -0400, Rajesh Venkatasubramanian wrote:
> > > No. Unfortunately exit_mmap() removes vmas from ->i_mmap after removing
> > > page table pages. Maybe we can reverse this, though.
> >
> > Something like this?
> >
> >
> > Index: mm1-2.6.8.1/mm/mmap.c
> > ===================================================================
> > --- mm1-2.6.8.1.orig/mm/mmap.c 2004-08-16 23:47:16.000000000 -0700
> > +++ mm1-2.6.8.1/mm/mmap.c 2004-08-18 17:18:26.513559632 -0700
> > @@ -1810,6 +1810,14 @@
> > mm->map_count -= unmap_vmas(&tlb, mm, mm->mmap, 0,
> > ~0UL, &nr_accounted, NULL);
> > vm_unacct_memory(nr_accounted);
> > + /*
> > + * Walk the list again, actually closing and freeing it.
> > + */
> > + while (vma) {
> > + struct vm_area_struct *next = vma->vm_next;
> > + remove_vm_struct(vma);
> > + vma = next;
> > + }
> > BUG_ON(mm->map_count); /* This is just debugging */
> > clear_page_tables(tlb, FIRST_USER_PGD_NR, USER_PTRS_PER_PGD);
> > tlb_finish_mmu(tlb, 0, MM_VM_SIZE(mm));
> > @@ -1822,16 +1830,6 @@
> > mm->locked_vm = 0;
> >
> > spin_unlock(&mm->page_table_lock);
> > -
> > - /*
> > - * Walk the list again, actually closing and freeing it
> > - * without holding any MM locks.
> > - */
> > - while (vma) {
> > - struct vm_area_struct *next = vma->vm_next;
> > - remove_vm_struct(vma);
> > - vma = next;
> > - }
> > }
> >
> > /* Insert vm structure into process list sorted by address
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
> >
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/