Re: Page fault scalability patch V18: Drop first acquisition of ptl

From: Andrew Morton
Date: Wed Mar 02 2005 - 21:05:44 EST


Christoph Lameter <clameter@xxxxxxx> wrote:
>
> ...
> static int do_wp_page(struct mm_struct *mm, struct vm_area_struct * vma,
> unsigned long address, pte_t *page_table, pmd_t *pmd, pte_t pte)
> @@ -1306,22 +1308,25 @@ static int do_wp_page(struct mm_struct *
> flush_cache_page(vma, address);
> entry = maybe_mkwrite(pte_mkyoung(pte_mkdirty(pte)),
> vma);
> - ptep_set_access_flags(vma, address, page_table, entry, 1);
> - update_mmu_cache(vma, address, entry);
> + /*
> + * If the bits are not updated then another fault
> + * will be generated with another chance of updating.
> + */
> + if (ptep_cmpxchg(page_table, pte, entry))
> + update_mmu_cache(vma, address, entry);
> + else
> + inc_page_state(cmpxchg_fail_flag_reuse);
> pte_unmap(page_table);
> - spin_unlock(&mm->page_table_lock);
> + page_table_atomic_stop(mm);
> return VM_FAULT_MINOR;
> }
> }
> pte_unmap(page_table);
> + page_table_atomic_stop(mm);
>
> /*
> * Ok, we need to copy. Oh, well..
> */
> - if (!PageReserved(old_page))
> - page_cache_get(old_page);

hm, this seems to be an unrelated change. You're saying that this page is
protected from munmap() by munmap()'s down_write(mmap_sem), yes? What
stops memory reclaim from freeing old_page?

> static int do_swap_page(struct mm_struct * mm,
> struct vm_area_struct * vma, unsigned long address,
> @@ -1727,12 +1733,11 @@ static int do_swap_page(struct mm_struct
> grab_swap_token();
> }
>
> - mark_page_accessed(page);
> + SetPageReferenced(page);

Another unrelated change. IIRC, this is indeed equivalent, but I forget
why. Care to remind me?


Overall, do we know which architectures are capable of using this feature?
Would ppc64 (and sparc64?) still have a problem with page_table_lock no
longer protecting their internals?

I'd really like to see other architecture maintainers stand up and say
"yes, we need this".

Did you consider doing the locking at the pte page level? That could be
neater than all those games with atomic pte operations.
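
To make the comparison concrete, here is a rough userspace sketch of the
two styles, not kernel code. Everything in it is an assumption made for
illustration: struct pte_page, the flag values, and both helper names are
hypothetical, C11 atomics stand in for ptep_cmpxchg(), and a small
test-and-set spinlock stands in for a per-pte-page lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTES_PER_PAGE 512   /* assumption: illustrative size only */
#define PTE_DIRTY 0x1u      /* hypothetical flag bits */
#define PTE_YOUNG 0x2u

typedef uint64_t pte_val;

/* Hypothetical stand-in for one page worth of ptes plus its own lock. */
struct pte_page {
    atomic_flag lock;                  /* per-pte-page spinlock */
    _Atomic pte_val pte[PTES_PER_PAGE];
};

static void pte_page_init(struct pte_page *pp)
{
    atomic_flag_clear(&pp->lock);
    for (int i = 0; i < PTES_PER_PAGE; i++)
        atomic_init(&pp->pte[i], 0);
}

/*
 * Style 1: lockless update. Succeeds only if the pte still holds the
 * value the caller read earlier; on failure nothing is written and the
 * caller simply retries later (in the patch, by taking another fault).
 */
static bool pte_update_cmpxchg(struct pte_page *pp, int idx,
                               pte_val oldval, pte_val newval)
{
    return atomic_compare_exchange_strong(&pp->pte[idx], &oldval, newval);
}

/*
 * Style 2: take a lock covering just this page of ptes, recheck the
 * entry under the lock, then update it with plain load/store semantics.
 */
static bool pte_update_locked(struct pte_page *pp, int idx,
                              pte_val oldval, pte_val newval)
{
    bool ok = false;

    while (atomic_flag_test_and_set_explicit(&pp->lock,
                                             memory_order_acquire))
        ;   /* spin until the lock is free */

    if (atomic_load(&pp->pte[idx]) == oldval) {
        atomic_store(&pp->pte[idx], newval);
        ok = true;
    }

    atomic_flag_clear_explicit(&pp->lock, memory_order_release);
    return ok;
}
```

Both helpers refuse the update when the entry changed underneath the
caller. The difference is that the locked variant lets a sequence of
several pte operations share one critical section, while the cmpxchg
variant has to express every change as a single compare-and-exchange.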

We need to do the big page-table-walker code consolidation/cleanup. That
might have some overlap.