Handle updating of ACCESSED and DIRTY in hugetlb_fault()

From: David Gibson
Date: Tue Aug 19 2008 - 20:39:25 EST


The page fault path for normal pages, if the fault is neither a
no-page fault nor a write-protect fault, will update the DIRTY and
ACCESSED bits in the page table appropriately.

The hugepage fault path, however, does not do this, handling only
no-page or write-protect type faults. It assumes that either the
ACCESSED and DIRTY bits are irrelevant for hugepages (usually true,
since they are never swapped) or that they are handled by the arch
code.

This is inconvenient for some software-loaded TLB architectures, where
the _PAGE_ACCESSED (_PAGE_DIRTY) bits need to be set to enable read
(write) access to the page at the TLB miss. This could be worked
around in the arch TLB miss code, but the TLB miss fast path is easier
to keep simple if the hugetlb_fault() path handles this, as the normal
page fault path does.

Signed-off-by: David Gibson <david@xxxxxxxxxxxxxxxxxxxxx>

---

RFC, looking to merge for 2.6.28.
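
For readers who want the intended control flow without reading the diff,
here is a rough userspace sketch. It is a toy model under stated
assumptions, not kernel code: the TOY_* flags, toy_tlb_miss_ok() and
toy_hugetlb_fault() are hypothetical names invented for illustration,
and a plain unsigned int stands in for a huge PTE. The comments note
which real helper each step is meant to mirror.

```c
/* Toy model only: every name here is illustrative, not a kernel definition. */
#include <stdbool.h>

enum {
	TOY_PRESENT  = 1u << 0,
	TOY_WRITE    = 1u << 1,
	TOY_ACCESSED = 1u << 2,
	TOY_DIRTY    = 1u << 3,
};

enum toy_result {
	TOY_NO_PAGE,		/* stands in for hugetlb_no_page() */
	TOY_COW,		/* stands in for hugetlb_cow() */
	TOY_FLAGS_UPDATED,	/* ACCESSED and/or DIRTY were newly set */
	TOY_NOTHING_TO_DO,	/* bits were already set */
};

/*
 * Assumed software TLB miss fast path: it loads the entry only if the
 * ACCESSED (and, for writes, DIRTY) bit is already set, and otherwise
 * falls through to the fault path.
 */
static bool toy_tlb_miss_ok(unsigned int pte, bool write_access)
{
	unsigned int need = TOY_PRESENT | TOY_ACCESSED;

	if (write_access)
		need |= TOY_WRITE | TOY_DIRTY;
	return (pte & need) == need;
}

/*
 * Same order of checks as the patched hugetlb_fault(): no-page fault,
 * then write-protect fault, then the ACCESSED/DIRTY update.
 */
static enum toy_result toy_hugetlb_fault(unsigned int *pte, bool write_access)
{
	unsigned int entry = *pte;

	if (!(entry & TOY_PRESENT))
		return TOY_NO_PAGE;

	if (write_access) {
		if (!(entry & TOY_WRITE))
			return TOY_COW;
		entry |= TOY_DIRTY;	/* mirrors pte_mkdirty() */
	}
	entry |= TOY_ACCESSED;		/* mirrors pte_mkyoung() */

	if (entry == *pte)
		return TOY_NOTHING_TO_DO;
	*pte = entry;			/* mirrors huge_ptep_set_access_flags() */
	return TOY_FLAGS_UPDATED;
}
```

In this model, a read of a present, writable entry with no ACCESSED bit
fails toy_tlb_miss_ok(), goes through toy_hugetlb_fault(), and then
passes the miss check on retry, which is the behaviour the patch is
after.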

Index: working-2.6/mm/hugetlb.c
===================================================================
--- working-2.6.orig/mm/hugetlb.c 2008-08-19 15:14:51.000000000 +1000
+++ working-2.6/mm/hugetlb.c 2008-08-19 15:28:27.000000000 +1000
@@ -2008,7 +2008,7 @@ int hugetlb_fault(struct mm_struct *mm,
 	entry = huge_ptep_get(ptep);
 	if (huge_pte_none(entry)) {
 		ret = hugetlb_no_page(mm, vma, address, ptep, write_access);
-		goto out_unlock;
+		goto out_mutex;
 	}

 	ret = 0;
@@ -2024,7 +2024,7 @@ int hugetlb_fault(struct mm_struct *mm,
 	if (write_access && !pte_write(entry)) {
 		if (vma_needs_reservation(h, vma, address) < 0) {
 			ret = VM_FAULT_OOM;
-			goto out_unlock;
+			goto out_mutex;
 		}

 		if (!(vma->vm_flags & VM_SHARED))
@@ -2034,10 +2034,23 @@ int hugetlb_fault(struct mm_struct *mm,

 	spin_lock(&mm->page_table_lock);
 	/* Check for a racing update before calling hugetlb_cow */
-	if (likely(pte_same(entry, huge_ptep_get(ptep))))
-		if (write_access && !pte_write(entry))
+	if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
+		goto out_page_table_lock;
+
+
+	if (write_access) {
+		if (!pte_write(entry)) {
 			ret = hugetlb_cow(mm, vma, address, ptep, entry,
 							pagecache_page);
+			goto out_page_table_lock;
+		}
+		entry = pte_mkdirty(entry);
+	}
+	entry = pte_mkyoung(entry);
+	if (huge_ptep_set_access_flags(vma, address, ptep, entry, write_access))
+		update_mmu_cache(vma, address, entry);
+
+out_page_table_lock:
 	spin_unlock(&mm->page_table_lock);

 	if (pagecache_page) {
@@ -2045,7 +2058,7 @@ int hugetlb_fault(struct mm_struct *mm,
 		put_page(pagecache_page);
 	}

-out_unlock:
+out_mutex:
 	mutex_unlock(&hugetlb_instantiation_mutex);

 	return ret;

--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson